The new business of AI and how it’s different from traditional software (a16z.com)
398 points by gk1 11 months ago | 210 comments



About 6 years ago I invested as an angel in an ML startup. The proposition was that they had a very nice workflow tool to take data, scrub it, partition it, build models against it, view performance and deploy it through an API. At the time there was nothing else out there really that was as polished and understandable to a business person. Then, very rapidly, the field got democratized. People were firing up notebooks, TensorFlow popped into existence and their edge eroded. They were pitching to teams who had just hired data scientists who of course didn't want to buy something that made their job look easier! The other reality, which ties to this article, is that they had to do a lot of hand holding with clients, so the service cost became high. This wasn't custom model development, more training on the platform, so as they got established they eventually worked out a way to package it into the sale. The company in the end changed course. They built out the product to focus on a single vertical that was getting traction in finance. This was a hugely contentious change and they lost some people along the way - but - it ultimately saved them from being wiped out. In the end they sold out to a much larger player in that vertical who had no AI strategy and we all made a decent if not mind blowing return on our investment. Lots of lessons learnt on the way and it was a fun ride overall.


> The company in the end changed course. They built out the product to focus on a single vertical that was getting traction in finance.

Building a startup in the "AI" space myself for more than two years now, I'm more convinced than ever that this is one major element of a successful strategy for such companies: focus on one vertical.


100%. The faster you realise this and act on it, the quicker the path in front of you gets clearer and your chances of success shoot up.


> They were pitching to teams who had just hired data scientists who of course didn't want to buy something that made their job look easier!

I don't really follow. Nobody thinks a guy driving a forklift is working just as hard as a guy lifting crates by hand. But he's doing more lifting! Why wouldn't you want your job to look easier? The easier it is, the more you can do.


Because it's bologna presented by an investor who has a huge bias in favor of their own product. Likely it was teams of data scientists who didn't see the value add in their software. Having worked in the field, and seen plenty of ML platform pitches, they often promise much more than they can deliver and force data scientists to work in a box they have no desire to be in.


Yeah. I've seen so many vendors attempting to fix the data science workflow appear and disappear in the last decade. I'm not going to waste my time refactoring my workflow to fit some proprietary tool that might no longer exist in six months. Plus these tools are often built with an idealized conception of a data science workflow in mind, and their proprietary nature means that they tend not to be very adaptable/modular.


What if it wasn't proprietary?


If it's not proprietary and fits a need I have, absolutely! For example, Voila for Jupyter is a recent-ish tool that fits a need I have had from time to time and I will happily adopt it the next time that need comes up.


A lot of times it doesn't make life easier. It just looks easier to people signing the checks. And the people building/selling the products aren't data scientists so they don't realize they are not making life easier. They just see data scientists as a bunch of hard-nosed elitists who want everything for free.

I've worked as a data scientist on both sides of the table: as a customer-facing data scientist for a product company selling to data scientists, and for 2 companies that sell physical products and employ data scientists.

I've been in sales situations where I'm telling the sales rep the product we are selling isn't going to work for the data scientists we are selling to. And their response is to go above their heads because the Data Scientists "are too technical and don't want to pay for anything."

I evaluate data science products on a semi-regular basis and many that "make my life easier" actually make one thing easier and 10 other things harder. Lots of these products are also new and they are trying to be competitive in a space that people are only starting to understand. So if something is broken I can't fix it. I'm stuck pointing fingers while my management gets frustrated with me because this solution was supposed to make life easier.

There's also the reality that right now open-source is king and if I want to hire somebody it's much easier to find people with those skills. And yes, I have a selfish desire to use open-source tools because I want to be employed in the future.

Don't get me wrong, there are product areas (e.g. Data Science DevOps) where it absolutely makes sense for a Data Scientist to buy something to make their lives easier. And a lot of them do.


The easier data science is, the more a business wants a domain expert doing it instead of a freshly graduated PhD.


The business is correct to want that. Unmotivated statistics are not good guides.


But the freshly graduated PhD does not, and is usually the person in charge of making the decision of whether to use the software initially when the field is new. Which would be a case of protecting their own job to the detriment of the company.


I've personally not seen this happen. Instead, I've seen many oversold ML/AI tools that offer no advantages over using open and freely available tooling that the data scientist (with or without a PhD) is already familiar with. I know this must be frustrating to ML solution vendors, but as with any product, the value proposition has to be there and easy to see for the domain expert. And the value has to be great because the downside is the vendor lock-in of the proprietary solution. Hence, ask yourself: given a green field real world ML project, would you use your tool (with the same learning algorithms, data manipulation methods etc. under the surface as everybody else), or resort to using some battle proven free and open stack?


Imagine for a minute that you are new on Earth, first day, and you are in a car as a passenger on a highway. You watch the driver look, hold the wheel, use the brake and drive.. It seems very important what the driver is doing, so you notice many small details. However, you have no idea how much the car weighs ! or on a hill that force changes ! You watch the driver so carefully but have no idea about the basic weight of the car..

Now, you are a business person in a room with your thoughts. A presenter talks and you listen, but you watch the posture, the haircut, the order of speech .. very carefully. But you have no idea about how the data moves, or the learning curve to use the tools, or even more how to innovate against "ordinary" .. you have no idea ! How could you.. so you watch the presenter carefully..

Basically, everyone has an idea of lifting a box.. so no matter how detached your life in the office is, you can appreciate a fork-lift. However, you have no idea about thirty years of *nix development, the toolkit evolution, the language wars.. etc

Here is the hard part -- many small-minded people (who run money) think NOTHING of learning.. it's not important.. they care about control of the situation, and who gets the profit. Some leaders actually cultivate a smug disdain for "workers" .. some of those leaders have money.. etc...

Know this well - it is hard to believe how true it is, but perhaps you will find out over the years.


That's another perspective on bike shedding.

http://bikeshed.com/


How does your comment relate to my comment, which you're supposedly responding to?


Because the forklift driver isn't a specialist position that they will try to undercut by using the tool. If you need a full data scientist without the tool, maybe a fresh junior developer can do the same with the tool. It won't work, but that doesn't help the data scientist who was replaced along the way. Eventually knowledge builds up as to what sort of skills are needed even with that tool, but until that happens there is a chance the tool was oversold, and that will result in either higher demands on the employee, an attempt to replace them, or something of that nature, such as less budget when growing the team.


To me, this sounds a lot like a justification built upon a bias (i.e. of course the tool the start-up was trying to sell was what the potential customers needed, it must have been those darn data scientists preventing the sales).

From my experience at the time, there were a lot of such start-ups around, most of which did not actually entirely (or often at all) meet the needs of their customers, and were promising a lot more than they actually delivered. Whereas having data scientists in your organisation meant that, at the very least, they were able to develop a proper understanding of your business, and actually create real value.


>Why wouldn't you want your job to look easier?

Automation has taught us that if your job looks easier, it takes substantially fewer of you to do it. Data scientists, in this instance, seem to be protecting themselves from what has happened in manufacturing for the last 50 years.


Well we all work with data scientists and can firmly attest this is just false.

They will work with whatever lets them focus on the actual domain problem, or a toolchain they are familiar with, or with something forced on them by org (while ranting). You know, exactly like developers.


> or with something forced on them by org (while ranting).

Too true.


Was going to answer this sooner but others beat me to it! I went to a few prospect pitches where some of the newly minted well paid data scientists worked overtime to justify that what they do is unique and could not be 'commoditized' through our product. In the end pitching to business or C level worked better.


Why wouldn't you want your job to look easier?

Because you want to be highly paid?

In your warehouse example, both guys are being paid hourly, but the forklift license holder makes more, because he is clearly both more productive and more skilled.


Sure, and his job is easier and looks easier.


his job is easier and looks easier.

Are you sure? Anyone can shift boxes by hand; it requires no training or experience whatsoever. Whereas operating a forklift is a skill that you have to be trained in and are tested on.


Also increased responsibility. You can seriously injure or kill someone with a forklift (I’ve seen someone drive the forks through the side of a fully loaded Cisco router once...)


Bad analogy. Data science isn't manual labor. Ask yourself, does an artisan woodworker want the shop to buy a CNC mill that they'll be obliged to use to justify its cost, knowing that they can be replaced more easily as a machine operator than as a skilled artisan?

They would probably rather advance their skills in woodworking than as a specialist in XYZ Woodco button-pushing.


Because with forklifts you need fewer box lifters. That's bad for the box lifters and so they will reject the technology.


Thank you, a very interesting story with a lot happening, told in only a few sentences.


Dunno if this is the startup I'm thinking of, but if it is, I'd attribute the exit entirely to the VC dude who took it over who obviously knew what he was doing. The founder-CEO was great at math, but it didn't look so good otherwise.


Great experience. Would like to understand more from your experience as we build our own venture in the super computing space for data scientists and AI workloads.


>The proposition was that they had a very nice workflow tool to take data, scrub it, partion it, build models against it, view performance

Question - what is the best tool in this exact space right now? (minus deployment)


MLFlow


was this evolution.ai?


Ya know what, AI might be the most hand-wavy term ever adopted in the tech world (and this is compared to big data, IoT, blockchain, etc.).

No one can really define what precisely AI is. Just in autonomous driving alone, there are 6 different "levels" of AI. If I write a decision tree algorithm, is that AI?

My point is, you can basically use the term AI to justify virtually anything when it comes to the value of software, which just makes articles like this not valuable at all. Why not just call it "this is what software is capable of doing for a business"?

One thing that is for sure about "AI", and really just "advancements in software", is that repeatable human tasks are and will be replaced by automated systems. The only thing left for non-software business interaction will be to simply deal with other human beings.


I would say the current wave of AI is just training and inference.

Training is getting computers to recognize a pattern by giving positive and negative examples. It uses huge compute and you get a model.

Inference is using that model to recognize that pattern in new data. It uses fewer resources, so it can process lots of data quickly or work on a small system.

So when I hear hand-wavy stuff, I just sort of mentally think - what pattern are they looking for, what is their source data and what data will they use it on.
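To make that concrete, here's a minimal sketch of the two steps (scikit-learn, with made-up data purely for illustration):

    # Training: show the computer positive (1) and negative (0) examples
    # and fit a model. This is the compute-heavy step.
    from sklearn.linear_model import LogisticRegression

    X_train = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9]]
    y_train = [0, 1, 0, 1]
    model = LogisticRegression().fit(X_train, y_train)

    # Inference: use the fitted model to recognize the pattern in new data.
    # This step is cheap and can run on a small system.
    print(model.predict([[0.15, 0.25], [0.85, 0.75]]))  # e.g. [0 1]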


"Training is getting computers to recognise a pattern by giving positive and negative examples"

I think you are only talking about binary classification. Even multi class classification is at a significant state of the art level now. But yes, ultimately this is a pattern recognition process.


Hence the AI moniker is just the wrong word to use I think. I like ML a lot more, or even better would be Machine Cognition Learning.


My go-to definition is "machine enhanced statistics"


If you take a look at the papers in this field[1] you will see they almost never mention the words 'artificial intelligence'. It's 99% in the press, peddled by journalists trying to exploit every angle of sensationalism and fear.

[1] http://www.arxiv-sanity.com/


It's because "artificial intelligence" is a very broad expression and doesn't say a lot when you're in the domain. You won't find the expression "computer science" a lot either, it doesn't mean "computer science" is a sensationalist expression only used by journalists.


The names of the conferences and journals where they are published often have the term in them (see AAAI, IJCAI, JAIR, ECAI, etc.) so it's not like the term is not recognized in the scientific community.


Press is only half of the picture. The other half is investors being attracted to AI like moths to a flame.

And then there's the feedback loop. The media keep saying that "AI" is going to change the future (for better or worse), which makes it seem important. This creates a fertile ground for investors to make money (whether because they believe a product will be revolutionary, or that they can sell it to a greater fool before it blows up). The resulting influx of money incentivizes everyone to try and attach "AI" to anything they can, no matter how hare-brained it is, in order to capture some of that money. Then the media notice all these new companies and report on how AI will change everything, for better or worse. Lather, rinse, repeat.


> My point is, you can basically use the term AI to justify virtually anything when it comes to the value of software, which just makes articles like this not valuable at all.

The article has SOME value!!!! E.g., a nice list of what can go wrong!

I won't defend the article or AI, but the article did at least hint at what it meant by "AI" -- training from data or some such. Or, one might say, large scale empirical curve fitting justified from its performance on the test data (there are some questions here, too, i.e., maybe some missing assumptions).

So, it appears that the article was relatively focused on what it regarded as "AI" and did not fall for the hype practice of calling nearly any software (from statistics, applied probability, optimization, simulation, etc.) AI. Uh, is the prediction of the motions of the planets and moons in our solar system from differential equations and Newton's laws "AI"? How about other applications of calculus, e.g., a little viral growth model

y'(t) = k y(t) (b - y(t))

Is that "AI"? I'd say no, but then want other work called AI should not be?


To be honest, it's the same in research. I'm an AI researcher, and was before the current boom.

AI in research, as best as I can judge, means "Solving a problem where there isn't a simple polynomial-time algorithm which produces a predictable answer". I once heard someone say "Problems are AI until we know how to do them".


> "Problems are AI until we know how to do them".

Philosophy sits in a similar niche. Physics, Chemistry, etc. were all philosophical problems until we learnt better, more specific ways to study them.


I really like this analogy. I'm reminded that Roger Needham used to say "If we knew what we were doing, it wouldn't be research." (citation: overheard from co-workers of his, also attributed to Einstein)


Those were metaphysical problems. There are other branches of philosophy that don't necessarily get handed off to science, like ethics and aesthetics, or existential questions and what makes life worth living.


You can stretch the term AI to refer to a lot of things, but not as far as I think you're implying. Even today, the vast majority of software written is unambiguously not AI, and the substantial majority of startups couldn't be called AI companies even as a dumb marketing tactic.


Also, some of the competition for most overstretched buzzword is pretty stiff. For example all of those "block chain" applications that are equivalent both philosophically and practically to one trusted party with an SQL database.

I don't think IoT is stretched at all, the name implies "refrigerators but with GSM" and "cars, but they report your every move to the manufacturer," and that's what we have. They're "things" (objects) on the "internet" (a data collection system for marketers).


> practically to one trusted party with an SQL database.

Nicely put. For some months I posted essentially this description to Fred Wilson's AVC.COM. Good to see that I'm not the Lone Ranger in this view.

IIRC I also mentioned that the SQL database could be both fault tolerant, including redundant, and distributed. Also, last time I read up on SQL (e.g., Microsoft's SQL Server) it seemed that the opportunities for security, authentication, capabilities, access control lists, and encryption were good.


The majority of software written today is code bureaucracy, linking together people and decision makers.

"AI" can be stretched very far. If I apply linear regression on user-entered numbers to suggest them something, I can call it "AI" and be less dishonest with it than a lot of startups on the market.

(LR at least isn't a glorified RNG, which a DNN can become if one's not careful, or applies it to a domain that doesn't give obvious, visual feedback.)


No dataset, no training, no layers = not AI


Classically speaking, any time you gave computers a goal to optimize for, it was called "AI"; a large amount of AI research in the 80's and 90's was about how to optimize search. Thus SQLite's website, for instance, (correctly) says: "The query planner is an AI that tries to pick the fastest and most efficient algorithm for each SQL statement." Optimizing database queries was a common topic for artificial intelligence research in the 80's and 90's.

It's just that now we're so used to the idea of a computer searching for the best flight, the best map, the best way to optimize your code, the best way to execute your SQL query, that it doesn't seem like "magic" any more.

When you say "no dataset, no training" you're talking about machine learning. And when you say "layers", you're talking about deep learning.

[1] https://sqlite.org/queryplanner.html


Genetic algorithms might have fallen out of favour recently but I don't think anyone would exclude them from AI as a category. The field in my experience has always contained a lot of planning and inference which you seem to be excluding.


LR absolutely has datasets and training. You train a regression model in much the same way as you train any other ML model, DNNs included - you use some data to derive its parameters, use the rest to validate.

LR typically has number of layers = 1, though I bet someone imaginative enough could create "deep linear regression", if that was useful for anything.
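A minimal sketch of that point (scikit-learn, synthetic data): the regression's parameters are derived from one split of the data and validated on the rest, exactly like any other model.

    # Linear regression trained and validated like any other ML model.
    # Data is synthetic, purely for illustration.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.uniform(0, 10, size=(200, 1))
    y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 0.5, size=200)  # noisy line

    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25)
    model = LinearRegression().fit(X_train, y_train)

    print(model.coef_, model.intercept_)  # roughly [3.0] and 2.0
    print(model.score(X_val, y_val))      # R^2 on the held-out split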


The guys who coined the term "AI" strongly disagree.


> If I write a decision tree algorithm, is that AI?

It depends how that decision tree is produced. If you wrote it all by yourself, it's not. I mean, you just wrote a plain algorithm by yourself. A decision tree is a program, and vice versa.

If, OTOH, the tree is inferred from examples (good old CART for instance) or if it is generated on the fly (good old alpha-beta for instance), it's definitely AI.

And actually, I'd say it's a good definition for AI. AI starts at the moment you don't write the decision trees / algorithms yourself but, instead, write a program that will produce the actual decision tree / algorithm.

Writing a program? Not AI. Writing a program that writes a program? Definitely AI.
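To make the distinction concrete, here's a minimal sketch (scikit-learn's CART implementation, toy data): nobody writes the tree below by hand; the program induces it from labelled examples.

    # A decision tree inferred from examples (CART), not written by hand.
    # Toy data: the label is 1 when both coordinates exceed 0.5.
    from sklearn.tree import DecisionTreeClassifier, export_text

    X = [[0.1, 0.2], [0.9, 0.8], [0.7, 0.9], [0.2, 0.6], [0.8, 0.3], [0.6, 0.7]]
    y = [0, 1, 1, 0, 0, 1]

    tree = DecisionTreeClassifier().fit(X, y)  # the program that writes the "program"
    print(export_text(tree))                   # the induced decision tree
    print(tree.predict([[0.75, 0.85]]))        # apply it to a new example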


I would disagree. What you're describing above is the difference between machine learning and AI. AI is a broader term that encompasses machine learning.


Nope. Alpha-beta (which I'm talking about in my post) is definitely not machine learning, for instance, and yet it is definitely an AI technique.


I would agree that it is. I would also say that handwritten decision trees are AI. The method of creating them wasn't AI but just Intelligence. The resulting program is a form of artificial intelligence. What do you think :)


>Writing a program that writes a program? Definitely AI.

So do you think the following is AI? print("print('Hello World!')")

It is a program which writes a program, so by what you've just said, it's AI...

Aside: For people interested in programs that write programs (that write programs, that write programs, that write programs ...), the following repo might be interesting: https://github.com/semitrivial/ions


You're taking my sentence a bit too literally there. Take it as a heuristic, not as a formal definition.

Anyway nobody's able to give a compact definition of AI that cannot be destroyed by a ton of counter-examples.


Compilers are AI?


"Plain" compilers are just translating from one notation to another, so they aren't.

But optimizing compilers that analyze your code and find that portions of your code can be rewritten to make the produced executable more efficient? I'd say that sounds a lot like AI, yeah. I wouldn't be shocked if someone told me "gcc's -O3 option uses AI techniques".


If you have a compiler that can write a correct program from examples of the desired output, definitely yes.


That's usually called program synthesis, and a lot of it uses formal methods. It isn't usually considered AI (but recently the field started incorporating a lot of ML, like pretty much anything in CS these days.)


Correct me if I'm wrong, but I think program synthesis works more from writing very general formal specifications, rather than learning from a sample of desired outputs.

The latter is called Programming By Example, and the pattern-finding process needed to infer generalities from specific samples does benefit from machine learning techniques.


"programming by example" is a form of program synthesis, where the spec is provided by the examples. It is a formal specification, albeit incomplete. Program synthesis, as a concept, does not require the specification to be complete.


I have seen massive failures where management got sucked into AI hype and failed to realize the most basic of advantages. They hired for fancy titles like Data Scientist etc., people who were great at building simple prototypes but failed every time they met real world use cases.


It is said that the true definition of AI is: "Awaiting Implementation."


This field, call it what you want, is exploding with papers, implementations, datasets, frameworks and courses. By any name you want, it is quite useful and has many applications. But if you prefer you can just get hung up on the term AI.


Or augmented intelligence, because that's what we're really doing with computers. Augmenting human intelligence, not replacing it.


Anything that is function approximation is AI. This includes decision trees.


Not necessarily. Numerical integration of differential equations is function approximation. Yes the algorithm you would use may be an “artificial” implementation of “intelligence” but it is not AI in the sense people usually think of it today.

The bigger picture is that function approximation is ubiquitous in applied mathematics, whether it be numerical calculus or statistics. Machine learning encompasses the state-of-the-art, big-compute techniques that have made statistical function approximation so much more useful.


I'd say code generation is AI. That would make Lisp macros a degenerate case (though a Lisp macro can implement a decision tree or a DNN). It would also make DNNs and classical ML both degenerate cases across a different dimension - you have a fixed box called "bunch of linear algebra" with lots of input parameters, and code generation amounts to determining the best sets of those parameters (remember, code = data). Genetic programming (not genetic algorithms) would be in the middle somewhere, because it involves generating Turing-complete code and not just numbers.


Right, it's a buzzword.

Buzzwords are very similar to phatic expressions (aka small talk). They're words that don't convey much information, because they lack stringent definitions. As such, they do little other than feed confirmation biases. Which is why you often find these types of terms effectively used in marketing... and amongst those in management...



AI is just "if" statements with good marketing.


AI - amplified "if"s


This is a really good article. One company that I always think of when I think machine learning is the computer vision "startup" Clarifai. At one point in time they were cutting edge, filling the need for large scale image classification that enterprises had. This was when computer vision neural network architectures were rudimentary and hard to train (they still kind of are). Then in-house data science teams sprang up, tooling got better, better network architectures came out, and Clarifai essentially lost all of their edge overnight. Machine learning in itself is basically never the edge; it has to be a unique data set or sticky user base, something else that builds a moat.


I was one of the first few employees at Clarifai so I can add to this.

When Matt and Adam started the company there was no TensorFlow, and outside of the Hinton/Bengio/LeCun triangle nobody was doing deep learning yet. Matt had just beaten Google on ImageNet (around the time he was doing an internship at Google Brain under Jeff Dean) and was one of the few experts in the field, as he was lucky enough to have Hinton as his master's thesis advisor at UofT and did a PhD at NYU under Rob Fergus and Yann LeCun.

We had a clear technological advantage for about 2 years but thanks to the open nature of deep learning research (arxiv and willingness to open source code) the whole field caught up. Google giving out tensorflow and pretrained models for free made a bit of a dent as well.

A big problem with the "machine learning model as a service" business model is that each customer has slightly different needs and a different source of data so an off the shelf solution is usually not good enough. Because of that you end up being forced into doing consulting for large companies that don't know how to hire machine learning engineers. You end up spending months building them a model that they won't even know how to use and move on to the next customer.

IMHO there are only two viable AI business models right now:

1. Spin out your research lab into a company and keep churning out papers without ever thinking about having real customers and make enough noise to get acquired by FAANG, ala Deepmind, Metamind, etc.

2. Find a problem with a lot of repetitive manual labor and slowly wedge a machine learning model into the process. Design a good feedback loop by first augmenting the workers and use their feedback to keep improving the model until it's good enough to replace them. Doing so requires you to actually build a real product in that domain, so you'll need a much more diverse team than a paper mill. Most companies doing this shouldn't even call themselves "AI" companies since their customers don't care how the solution works as long as it solves their problem.

I believe that companies pursuing option 2 will take over a lot of large established players because building machine learning driven products requires buy in from the whole organization and it's not something that's easy to do at large enterprises. You need to be able to design the product in a way that helps you collect feedback, build data infrastructure to collect it and be willing to accept a solution that won't always be right.


Awesome to hear from somebody close to the source. There's a third option. There are entire industries where automation has not been practical because of the necessity for many hardcoded rules - problems outside of bland image recognition, including those solved by novel architectures like GANs and encoders and such. ML has finally gotten to the point where complex heuristics can be learned to automate those tasks which were impractical previously. Now you can legitimately write and sell ML services which meaningfully analyze, catalogue, and search data in ways that are currently human intensive, but just unique enough that regular programming won't cut it. There will be a proliferation of such businesses in the near future. The first wave is in development now. I've said it repeatedly, and I'll say it again, we are on the verge of an internet-like change in society. ML is poised to take human endeavors to new heights in the next decade...and if innovation and hardware continue to progress, I think the recent proliferation of the ML zoo and associated theory has given us the foundational tools for true AI which we may see in our [distant] lifetimes.


I am not sure what the difference is between this and "option 2" in the post you replied to. Some examples would be helpful. It seems to me both options replace human intensive tasks.


Would you share any examples of research or companies that are doing this?


Thanks for the info.

What do you think about the auto ml business model?

I.e. give each customer the ability to create their own models (using their own data), but automating the model training and deployment?


I think it's a great way to bill enterprise customers for a ton of compute.

The intersection of people who can't train a machine learning model but can properly use and evaluate one is really small. Doing a brute force architecture search to squeeze out an extra percentage point of accuracy is not that useful.

Few shot learning is a much more interesting proposition and it's something that Clarifai offers.


"AI companies simply don’t have the same economic construction as software businesses"

Last I checked, "AI" is software.

Nowhere in here did I read anything about the fundamental truth of modern AI in production:

[AI] is a feature inside a successful product.

There are no "AI" companies, or "AI" products. There are companies which provide services to do inference on data, or in some cases tools/platforms so you can do it on your own.

They also confuse me by using the term services to mean bespoke Non Recurring Engineering efforts, instead of including something like a REST based API service that reflects most "pure" AI companies. Amazon Rekognition or MS Cognitive Services API are perfect examples of SaaS like AI services, but they aren't products exactly because they are used inside some other product as a feature.

At the end of the day, if you look at successful uses of AI in products they are one tiny piece of a larger product, that helps the product scale where it couldn't have before. That's pretty much the only place where it is proving very successful.

Even then, more and more of that is being done in house with the rise of things like Sagemaker and other turnkey ML inference tools.


I agree that this is a common misconception people have about ML and how it fits into our world. If you want to see successful ML products, you don't have to find some AGI stealth mode startup—just look at your phone:

- Gmail's Smart Compose
- Netflix's Recommendation Engine
- Uber's ETA Prediction

ML functionality is becoming a standard feature in software, and in my opinion, that's the real "ML Revolution." As you mention, turnkey inference tools like Cortex (full disclosure/shameless plug: I'm a contributor) are making this accessible to virtually all eng teams.


The AI companies the article is referencing are trying to provide those examples: smart compose, Netflix recommendation engines, etc. to customer companies without an army of PhD’s. However the article makes it clear that doing that in a general way is hard, much harder than a normal SaaS business.


I don't disagree that providing ML-features-as-a-service in a general way is much harder than a normal SaaS business. However, I don't necessarily believe those sorts of companies are the future of commercial ML. Rather, I believe practices like transfer learning and improved infrastructure (see the turnkey inference platforms mentioned previously) are enabling startups without armies of PhD's to build these ML-powered features themselves.


It's a difference without distinction.

The services that ML based SaaS are offering are currently harder than the majority of other services.

However you could have said the same about any number of services/products over the years that were in the early adopter curve, which I will state confidently that we're still in for ML.

At the end of the day it's a hard SaaS just like VoIP and search and geoservices once were.


Yeah, the article doesn't say don't do a ML SaaS. Just that it won't have 80% margins like a typical SaaS business and you have to specialize to give you any sort of scale or it's just a Service business.


This is not a tech focused article, it's a business / investor article. It doesn't care about the definition of AI from a tech or software perspective - it cares about classifying types of companies (SaaS vs AI) so that it can make reasonable assumptions about expected performance of those categories and create boundaries / guidelines for how to think about investment strategy.


> it cares about classifying types of companies (SaaS vs AI)

Investor-focused or not, that seems like a category error.


This is exactly my point, it is a category error in the worst sense as it's creating a category "AI company" that would be charitably described as poorly defined.


My point is that it's poorly defined from a technical perspective but well defined from a business perspective.

If you define AI company from what AI means from a tech perspective, then yes, it's a useless categorization. But if you define AI company as any company that makes money through offering optimization through data as a service, then I think that's a useful definition, in that you can lump many companies together and make sensible generalizations about their business models, which the article does:

Saas business model - 80% margins, exponential scaling, 0 marginal costs, competitive moats through tech

AI business model - 30-50% margins, linear scaling, higher marginal costs, data isn't a great moat

Under the hood it could be deep learning or basic statistics, but it doesn't matter what technology they employ, the method of making money for these types of companies is the same.

(In large/mature markets though, it matters a lot what tech you use since these methods can be copied by your competitors and improved upon - you can't use basic statistics to beat top of the line models forever, so most if not all data companies can be treated as eventual AI companies. I suspect this is why they use the term AI company rather than data company).


> "AI companies simply don’t have the same economic construction as software businesses" > Last I checked, "AI" is software.

You might be underestimating the importance of the data you apply that software to. I might have some real estate ideas, but unless I get some data about properties and actual prices, etc., I am just writing a research paper.

Maybe in the US real estate data is more accessible but in other places very much not or it is very expensive. Just an example.


I like the hard-nosed business evaluation of this article. However, the difficulties of a successful “AI” company are the same difficulties a “web” company had in the 90s. How many “World Wide Web” (back when we used that phrase) products survived the bubble? And how many people made a lot of money while creating software of very little value? People understood, correctly, that the “Net” (sorry, can’t help my nostalgia) was a breakthrough technology that would change everything. But very few people understood how to use it effectively in that moment to provide business or consumer value.

The same is true today for deep learning models. The tech has advanced to the point where there are clearly products waiting to be built upon advances that are largely open-sourced (if not pretrained). But building a product that is useful hasn’t changed from being a very difficult task to an easy one simply because a new technology exists. Moreover, as the article points out, costs in terms of dollars and manpower are currently higher than making another web or mobile app.

All this should make hackers salivate. The higher the challenge and the more advanced the stack, the greater the opportunity becomes.

A useful way of thinking about this moment is that we have another platform to launch products from: first desktop, then web, then mobile, now “AI”. Each platform enabled a new kind of product. Unlike the other platforms, which were essentially hardware mechanisms of software delivery, ML/DL is a software mechanism of delivering software. It is a fundamentally different way of creating software that produces fundamentally different kinds of software — the kind you couldn’t make any other way.

The article is right to throw the cold-water of reality on the hype of AI. But the backdrop is one of breakthrough technologies that are just waiting to be leveraged.


> the difficulties of a successful “AI” company are the same difficulties a “web” company had in the 90s.

I had a similar thought reading the comments here about AI hype. In the mid to late nineties, there was a lot of hype about the Internet, and a lot of smart people dismissed the Internet as nothing more than hype. In the end, though, the Internet changed our world. I suspect that, eventually, AI will do the same.


It is increasingly clear to me that the vast majority of programmers do not understand the profound difference between modern machine learning and software development.

AI may just be a buzzword, but neural networks and ML are not. The tech has just arrived, this is fresh, evergreen territory, and the first wave of applied machine learning is quickly approaching, in quiet development across industries. Things are about to get really interesting with the academic explosion of neural network architectures and theory, and I would strongly recommend to any developer to spend a few months getting up to speed!


DL has not just arrived. It has been going on since before 2010 in industry and universities. There are multiple cohorts of PhDs in it.

DL is useful obviously. Some of the biggest products in the world depend on it for almost their entire value propositions e.g. YouTube, Netflix, Tesla. (This is a power law; the long tail of commercial AI are jokes or borderline frauds.)

But I wouldn't recommend that already practicing programmers FOMO into the field this late in the game. First of all, it's very much a different animal. ML is in the statistics/continuous world whereas programming is in the algorithmic/discrete world. Knuth said something about how it's a different type of mind (a more normal one) that's attracted to ML, not the same as the 2% computer mind. Not only will you be doing something you're potentially not as talented at, you will also have to start from scratch up against the seemingly endless stream of ML specialist PhD candidates.

You can implement neural nets from descriptions in papers and blog posts. Honestly though there's not much to it with all the dead simple libraries, e.g. Keras, PyTorch.
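For instance, a bare-bones PyTorch net and training loop along these lines fits in a dozen lines (shapes and data made up, just to show how little boilerplate the libraries need):

    # Tiny neural net and training loop in PyTorch; the data is random
    # noise, purely to illustrate the boilerplate involved.
    import torch
    from torch import nn

    model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.MSELoss()

    X, y = torch.randn(64, 10), torch.randn(64, 1)
    for _ in range(100):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
    print(loss.item())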

Putting together the whole system is a much harder problem and requires actual software expertise.


I honestly think if you are already a software engineer and you want to get into an ML company you should focus on building infrastructure for big ML products.

It seems the infrastructure game is very new for realistic, production ready machine learning. I mean, Uber was the first to really describe the idea of having a feature store back in 2017 with their Michelangelo[0] framework. We are just now starting to see feature stores pop up as a SaaS model (Hopsworks, Google's Feast, etc.).

I think we'll see "Machine Learning Engineers" start to pop up like crazy the next 2-3 years as people with a mixture of data engineering, machine learning, and software engineering experience begin to take these infrastructure roles.

[0] - https://eng.uber.com/michelangelo/
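(To give a rough idea of what a feature store does - this is just a toy illustration of the concept, not Michelangelo's or any vendor's actual API: precomputed features are keyed by entity and looked up identically when building training sets and when serving predictions, so the two can't silently diverge.)

    # Toy illustration of the feature-store idea; not any real product's API.
    class FeatureStore:
        def __init__(self):
            self._features = {}  # (entity_id, feature_name) -> value

        def put(self, entity_id, feature_name, value):
            self._features[(entity_id, feature_name)] = value

        def get_vector(self, entity_id, feature_names):
            return [self._features[(entity_id, f)] for f in feature_names]

    store = FeatureStore()
    store.put("user_42", "avg_trip_length_km", 7.3)
    store.put("user_42", "trips_last_7_days", 12)

    # The offline training pipeline and the online prediction service ask
    # for the same vector through the same interface.
    print(store.get_vector("user_42", ["avg_trip_length_km", "trips_last_7_days"]))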


> I would strongly recommend to any developer to spend a few months getting up to speed!

As someone who spent more than a few months on a couple courses on various platforms, a few months is NOTHING in this field.

ML is not something that most people out of bootcamps can make a living from.

ML is the ultimate winner-take all technology.

Sure, there might be some fast.ai examples of self taught ML scientists, but it is orders of magnitude harder than software development, mainly because of the much longer learning feedback loop -- you can't just printf or debug your code and expect to learn something new every 5 minutes.


> ML is the ultimate winner-take all technology.

Hence the flood of ML grads which will only get bigger and bigger as time goes on.

As a general observation of business, returns per unit of labor can range from completely linear (factories, hospitals) to more nonlinear in things like software or art. ML researcher is on the far end of this. Yes, Bengio and Goodfellow get paid 7 figures, but that's because even if you pay 100 amateurs 1/100th of that, they will still produce worse work.

If you want to be a good ML researcher, the time to get in was 10 years ago.


For broad research yes, I think there are still so many niches though where if you can combine domain expertise with good knowledge of ML you can achieve great results, simply because there will not be many people with this skill combination.


I have many physicist colleagues that are very successful in ML, and I think most of the people that work in the field today don’t have a formal ML education (as those just became popular/broadly available 5 years ago).

Deep learning brings some new paradigms to the table but it’s not fundamentally different from other optimization / ML methods, so with a background in statistics, math, natural sciences or quantitative economics (to name only a few) and a good understanding of programming (which most people from these fields have) you can do just fine. If you start out with zero knowledge of university-level math and no programming ability it will be much harder of course.


> but it is orders of magnitude harder than software development, mainly because of much longer learning feed-back

The order of magnitude point is just wrong. You might argue that it's harder because it's essentially software development and a lot of applied math. But that certainly doesn't account for orders of magnitude difference.


I was talking about novel/real-world applications. With software, you can pretty much know in advance what will work, ROUGHLY how long it will take and how much it will cost. With ML you have a high chance that ML will not work at all for your problem, or that YOU won't be able to solve it.

I'm a java developer and during my first 3 years I was able to investigate bugs not only in my code, but inside jboss or hibernate ORM; I can look up core java code and understand it just fine.

How many of the ML self-taught crowd can write framework-level code, or debug a ML algorithm bug?

Fooling management wanting to get into ML with some scikit code is easy; mastery of ML is orders of magnitude harder than mastering a programming language/frameworks.


> How many of the ML self-taught crowd can write framework-level code, or debug a ML algorithm bug?

This isn't specific to ML though - I don't see why the number should be different from the number of self-taught web developers who can write framework-level code.


On the contrary. ML is the new web-development. Anybody can do it. Sure, for some specialized cases you need an expert, but those cases become increasingly hard to find.


Let me guess, you are heavily invested in ai/ml and are well aboard the hypetrain?

What's about to happen is that people who have been convinced to invest in ML infra are about to start pulling budgets because they've had no results.


There will be a lot of experiments that won't work, but there will be some that do. They're going to be the foundation for something radical and new.

We're clearly entering a world where computers can make complicated, gut-feel decisions. This is an amazing new capability; with the fall of the best human Go players there are no longer any well-defined tasks that a human can outperform a computer in. If we can simulate a situation correctly then computers are now better than us.

It will be hard for this new capability not to reshape all our control systems over the next two or three decades.


>there are no longer any well-defined tasks that a human can outperform a computer in.

If by "well-defined" you mean "complete information", then yes. If by well-defined you mean... well-defined, then no. Humans are better at a lot of things still, including video games for example. OpenAI has been working for years on a DotA AI, and while very technologically interesting, it still can only play (well) with and against a fraction of heroes and mechanics, and even within those limitations humans have quickly been able to find weaknesses and outmaneuver it.

AI really hasn't changed the world that much over the last 5 years (and probably won't during the next 5). There are some problem domains that saw big interesting leaps (mainly image recognition, image reconstruction and game AI), but that's about it.

Self-driving cars, if they ever come to market properly, will be the biggest change. And that's certainly a big change, but it's not really the "software industry revolutionizing" change that some people seem to expect.


I'm surprised by your example; OpenAI has basically shown that humans can't compete at DotA. The bot outperformed humans in an unusual but reasonable restricted version of the game and it is obvious that it would do just as well at the full game excepting that OpenAI didn't have the stomach to try and tackle the engineering challenge for no reward.

They weren't even trying to abuse the reaction time of the computer, or train one agent to coordinate a team as one entity. There isn't really a question; if humans needed to win games of DotA for some reason we'd send machines. It just happens that DotA is a spectator sport and that would be boring.

Exploiting the AI ... maybe? The experience with AlphaGo is that peak computer performance is just going to be better than peak human performance. Dota isn't fundamentally more complicated than Go. A few orders of magnitude harder to engineer, but OpenAI's work suggests they're about 20% of the way through a 10 year process and they've covered the interesting parts - so they stopped. With more resourcing that could be solved quickly but it isn't enough of a challenge any more.

There isn't anything left to prove in DotA. Bots can play the game really well and would crush humans if it was important to anyone. Nobody cares enough to actually finish the project but if there was money on the line it is obvious how it would play out.


>it is obvious that it would do just as well at the full game excepting that OpenAI didn't have the stomach to try and tackle the engineering challenge for no reward.

That's an interesting interpretation. My interpretation is that they simply weren't able to do it yet (not for a lack of trying though), and so they gave up. It's not obvious at all to me that there is a way to simply scale their approach to the real game. I mean, certainly there is a way with infinite computing power, but it's not clear to me that even with all the computing power on earth currently this approach would work. 18 out of 117 heroes does reduce complexity quite a lot, especially when you pick ones with less complex mechanics.

>They weren't even trying to abuse the reaction time of the computer

They introduced a constant reaction time, but they still "abused" the interface they are provided. If this was a task in the real world, the AI would have the same interface as humans: The screen. They would have to move the screen and run object recognition algorithms on the rendered values, and they would have to "click" on units (not physically) instead of being able to internally select them. This would introduce completely different (and more realistic) reaction time parameters, because if a hero starts an animation it might not be immediately clear which animation it is in the first 5 frames, but the API will provide the value immediately.

AlphaStar's strength dropped considerably for example when they realized that the AI being able to perceive the entire map at once was an unfair advantage and restricted it so that it had to move the camera around. But still even AlphaStar is using an API instead of just the rendered output, which enables things like selecting units that are hidden behind other units, which for a human player is literally impossible without moving it out of the way first.

I think for AI to be able to claim that it is better at a game than humans, the AI ought to be playing the actual game, and use the same interface. (Not necessarily physically, you don't need a camera and a robot, but the AI's input should be the rendered output of the game, and the AI's output should be mouse movement / key strokes.)

You might think that would be a trivial and useless exercise (like it would be in Chess/Go), but I'm really not convinced of that for video games. There are a lot of examples where even playing a frame-by-frame recording of a fight you just can't tell what exactly is going on in a fight until a couple frames later, but the API will always provide floating-point accurate data of everything. This is a huge advantage that just doesn't translate into real world tasks when the AI has to use cameras and sensors to make sense of the world (in automated driving for example) just like we have to.


> 18 out of 117 heroes does reduce complexity quite a lot, especially when you pick ones with less complex mechanics.

Sure; but you can train one agent/hero, so the training scales in a somewhat linear fashion once they get beyond ~15-20 heroes. The computer won't be able to nail the full complexity of the game - but neither can a human, and the evidence from Go is that the amount of the game a computer can grasp is much greater than a human's.

Humans can't fully understand full DotA either; it is common for pros to be surprised by mechanical interactions or quirks on specific heroes.

It might be an interesting theoretical challenge, but there would need to be some solid evidence of that beyond OpenAI calling it done and going home. Especially given the breakneck speed at which the first 17 heroes got trained. The AI was improving faster than a human does in MMR/day rates for specific heroes. It looks like they got bored once the outcome became clear but needed a lot of work to achieve.

> If this was a task in the real world, the AI would have the same interface as humans: The screen.

This part of your argument is weak; you are relying on a computer having human limitations to show that it is superior. Computers don't have human limitations; that is a significant part of why humans can't compete with computers.

> I think for AI to be able to claim that it is better at a game than humans, the AI ought to be playing the actual game, and use the same interface.

What could you do if the bot people don't agree with you? When TeamDotaBot2000 is winning every game on the DotA ladder you can write them an angry all-chat message saying they didn't really win while your hero is dead and they're destroying your tree. That'd blend right in with the usual DotA crowd.

> You might think that would be a trivial and useless exercise (like it would be in Chess/Go), but I'm really not convinced of that for video games...

It isn't trivial; but it also isn't hard. It is time consuming and there are a lot of practical things that need to be solved. But if the resources are spent, it is expected that computers beat humans in any field with formally defined rules and objectives. That is a big deal.


>but you can train one agent/hero so the training scales in a somewhat linear fashion once they get beyond ~15-20 heros.

No, you can't. The problem isn't training the AI to actually play all the heroes, that's the easy part (and also it's not even required to play the full game, no human can play every single hero), the problem is training the AI to play against all the heroes. That's the challenge - and that's the part that doesn't scale linearly at all. The more heroes the opponent can pick, the more complex your "behavior tree" has to become to think about every possible combination of what they could do.

>you are relying on a computer having human limitations to show that it is superior.

No, I rely on a computer playing the same actual game and not cheating. Because that's pretty much exactly what they're doing right now. It's like modding the game so that every animation is color coded by whether it would currently hit you, every minion perfectly marked if it is in execution range, every champion running around with circles that show their ability ranges, with your champion color coded depending on which ranges you are in, etc. (Except actually even worse than that.)

Again, in the actual real world an AI might have a 360 degree view of the environment, but it still has cameras and it still needs to do naturally imperfect object recognition algorithms on everything - it doesn't just get floating point values for every object's position and trajectory from God.

>When TeamDotaBot2000 is winning every game on the DotA ladder

If TeamDotaBot2000 wasn't directly affiliated with Valve, they would literally get banned for cheating. Because they're not interacting with the game like a normal player would, and they are doing things even a perfect, god-given AI playing the game through the same interface couldn't.

>it is expected that computers beat humans in any field with formally defined rules and objectives.

I expect this to be eventually true in any field. So much so that humans will be fundamentally useless in terms of advancing knowledge, culture, or anything in between. But whether that's going to happen in 20, 50, 100 or 10,000 years really isn't clear to me, and neither is it clear that all they'd have to do is put a realistic amount of hardware into their current approach to learn the entire game. The complexity of this task is definitely scaling exponentially, and just like you can't just throw twice the compute at a 128-bit cryptographic key because it worked so well for 64 bits, you can't just throw more compute at a game to learn to play against more champions.


I've seen projects that sat in the pipeline for years, sold on considerable marketing and spurious, vague promises, fail completely at even relatively basic requirements.

The kind of people attracted to this area is also a factor. The automation revolution is a marketing conceit.


According to my friends in that business, this is already happening. Companies have bought into expensive garbage solutions under the name AI (my pet peeve is chatbots; so many companies got rich on that hype) for millions of $ and are getting slightly allergic to the AI thing.


> (my pet peeve is chatbots; so many companies got rich on that hype)

Yeah, billions went on chatbots that were no smarter than some very basic procedural logic and grepping a text file. I worked at an investment bank that put one in front of customer service and got rid of the phone line. It was utterly infuriating. You'd log an issue (which used to be done via a form) by answering questions from this bot. What did that achieve?


Was that chatbot LivePerson by any chance?


Think it might have been, yes. Familiar?


There was a company that got rid of its entire customer service staff after BERT came out. Someone convinced them that the Turing test was solved. I'm not sure what their status is now, but I bet they brought humans back.


AI is the new blockchain. The only difference is that blockchain is now beginning to look ready for real-world applications, having gone through its wild-west days. AI is still in that phase. Lots of talk, little to no substance. Yes, Google Maps, Translate, Siri, and countless others are great examples of AI. But there's definitely a lot of snake oil being peddled under the label. And unless we as an industry get past that, AI is not going to go anywhere as a tool.


> The only difference is that blockchain is now beginning to look ready for real-world applications, having gone through its wild-west days

Hold on for a moment, and name three. Because I honestly haven't seen a blockchain that isn't either a) a scam, b) a naive mistake, or c) an ecological disaster if ever scaled up (and therefore unworkable).

AI, for all the hype and nonsense around it, isn't even approaching the blockchain levels of bullshit.


These are my three favorite use cases (which I use personally):

1. Bitcoin - I have hedged currency successfully using Bitcoin

2. Maker - Check it out. It really does show the power of smart contracts

3. Oracles - Although I am not familiar with the tech, they are beginning to show real-world adoption. Maker uses oracles, for example. Check out Chainlink

Edit: Telegram? I use it


Most "AI" is a scam but "blockchain" is far worse of an offender. "Blockchain" has zero utility in the business world. Its only use case is decentralized, censorship-resistant protocols. No one is trying to censor a shipping company's supply chain manifest. No one is trying to censor JP Morgan's bank transactions. The people who need it are dissidents, deplatformed people, people who cannot get bank accounts, or people trading contraband. Not exactly areas to extract an industry from.

Half of these altcoins do not have working software. In 2017 the ICOs did not have any software at all, just a promise on a homepage with a button to send money. The other half do not scale or do not meet their own basic requirements. The oldest, most venerable project, Bitcoin, had an inflation bug a couple of years ago that would have let anyone counterfeit an unlimited amount of coins.

At least the AI scammers show up with code.


I loved all the hyped blockchain companies 3 years ago, about which I would always think "what does this do that a small MySQL instance doesn't already do better for a real business?"

Hey - you can save your customer database in a really cool convoluted and slow database. Because bitcoin. Plus your clients can run this fun app that drains their batteries.


Bitcoin is the ecological disaster that OP talked about.

Smart contracts have the fundamental problem that people have to write perfect, error-free code (as it can't be changed later), i.e. an impossibility. These smart contracts often deal with monetary value (as people strangely assign value to those "coins"), so something like the DAO hack on the Ethereum blockchain is inevitable. Also, to do anything on a blockchain, even just adding 1+1, your hardware, no matter how expensive and no matter how far in the future, has to run at full capacity crunching crypto. Your laptop will heat up and the fans go on full. Sounds trivial, but if you've experienced it, it's weird.

Oracles are a fundamental problem of blockchain: If you trust an outside source of truth, why need the blockchain? Go straight to the oracle.


In my estimation it is precisely the other way around. There are hardly any applications of blockchains that are not equally well solved by other methods, whereas computer vision alone has a huge number of applications, and “neural network”-based solutions are now undoubtedly the best approach to many hard computer vision problems.


Yes, but computer vision is not what people mean when they are talking about AI. Many of the great AI advances are available as cloud services, but that's not what people mean by AI either. Or they hide those use cases and services, focusing more on pie-in-the-sky narratives. That's okay if it's coming from Google/Facebook etc., but this is mostly management/consultancy speak now.


I’m glad you scoped it well. Computer vision is a somewhat narrow field, so for, say, a logistics company, AI can be applied to some but not all business processes. It’s an improvement but not a revolution.


I have yet to find a use case for AI where I work. We just need simple business apps.


We serve a lot of companies that actually have a use for computer vision and NLP.


Machine learning is revolutionizing and democratizing machine perception at an alarming rate.


> Machine learning

> revolutionizing and democratizing

> perception

> alarming rate

Well done. How can I send you money?


That may sound like a flood of buzzwords but he's not wrong at all, except maybe for "alarming rate".

This tech is "revolutionary" in that nothing like it has ever performed anywhere near the current level. MNIST is effectively a solved, toy problem now, and many image classification problems are as well.

Democratized in the sense that you can, for the first time, write and test these state-of-the-art algorithms on your personal machine; and while a couple of years ago you needed something beefy like a 1080 Ti, even that requirement has been relaxed with many optimized architectures.
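
To make the "runs on a personal machine" point concrete, here is a minimal sketch (assuming TensorFlow/Keras is installed; a toy model, not anyone's production setup) that trains a small MNIST classifier on a laptop CPU in a few minutes:

  # Minimal MNIST classifier; no GPU required.
  import tensorflow as tf

  (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
  x_train, x_test = x_train / 255.0, x_test / 255.0   # scale pixels to [0, 1]

  model = tf.keras.Sequential([
      tf.keras.layers.Flatten(input_shape=(28, 28)),
      tf.keras.layers.Dense(128, activation="relu"),
      tf.keras.layers.Dropout(0.2),
      tf.keras.layers.Dense(10, activation="softmax"),
  ])
  model.compile(optimizer="adam",
                loss="sparse_categorical_crossentropy",
                metrics=["accuracy"])
  model.fit(x_train, y_train, epochs=3)
  print(model.evaluate(x_test, y_test))   # typically ~97-98% test accuracy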

More importantly, all of this has literally exploded in the last 5-10 years, no doubt due in part to the open-access nature of arXiv and to open source.

That's totally fine though, while comparisons to blockchain are a little insulting, all of this cynicism means less competition for me!


Tech was there decades ago.


I wrote an article I published today about how AI is the biggest misnomer in tech history https://medium.com/@seibelj/the-artificial-intelligence-scam...

I wrote it to be tongue-in-cheek in a ranting style, but essentially "AI" businesses and the technology underpinning them are not the silver bullet the media and marketing hype have made them out to be. The linked a16z article shows how AI is the same story everywhere - enormous capital to get the data and engineers to automate, but even the "good" AI still gets it wrong much of the time, necessitating endless edge-case handling and human intervention, until eventually it's a giant ball of poorly understood and impossible-to-maintain pipelines that don't even provide a better result than a few humans with a spreadsheet.


Well said, the tech is not ready for what people are trying to use it for (for the most part). The problems people are trying to solve are still best served by clever algorithms + data.


This article really resonates with me: I'm currently working on bootstrapping an applied AI service and have faced most of the challenges mentioned, to the point where I was doubting myself because I was not as efficient or productive as other SaaS/software founders. So this article is kind of reassuring to me.

A few more comments:

Cloud infra: For a traditional web app, you can quickly deploy it on a cheap AWS or Heroku VM for a few dollars/mo and later scale as you get more traffic. With AI you now need expensive training VMs. There are free options such as Google Colab, but they don't scale for anything other than toy projects or prototyping. AWS's entry-level GPU instance (p2.xlarge) is $0.90/hour, i.e. $648/mo, and a more performant one (p3.2xlarge) is around $2160/mo. Yes, you should shut them down when you are done with training, but still. You can also use spot instances to reduce cost, but they're not straightforward to set up.
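
As a quick sanity check on those numbers, a back-of-the-envelope sketch (assuming roughly 720 on-demand hours per month and the hourly rates quoted above):

  # Rough monthly cost of a GPU instance left running 24/7.
  HOURS_PER_MONTH = 720   # ~30 days

  rates = {"p2.xlarge": 0.90, "p3.2xlarge": 3.00}   # $/hour; the latter implied by the ~$2160/mo figure
  for name, usd_per_hour in rates.items():
      print(f"{name}: ~${usd_per_hour * HOURS_PER_MONTH:,.0f}/mo")

  # p2.xlarge:  ~$648/mo
  # p3.2xlarge: ~$2,160/mo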

For inference, you also need a VM with enough memory for your model to fit in, so again an expensive VM from day one.

Datasets: If you rely on a publicly available dataset, chances are there are already 10 startups building the same product. In order to have a somewhat unique and differentiated product, you need a way to acquire and label a private dataset.

Humans in the loop: The labeling part is very tedious and costly, both in terms of time and money. You can hire experts or do it yourself at great cost, or you can hire cheap outsourced labor who will deliver low-quality annotations that you will spend a lot of time controlling, filtering, sending back, etc.

For inference, depending on your domain, even with state-of-the-art performance you may end up with, say, 90% accuracy, i.e. 1 angry customer out of 10. That's probably not acceptable, but it gets worse: chances are you will attract early-adopter customers who are faced with hard cases and whose current solution doesn't work - that's why they want to use your fancy AI in the first place. In that context, your accuracy for this kind of customer might actually be much worse. So again you need significant human resources to review inferences in production. It will be hard to offer real-time results, so you may have to design your product to be async and introduce some delay in responses, which may not be the UX you initially had in mind.

I still think there are tremendous opportunities in applied AI products and services, but it's important to have these challenges in mind when planning a new product or startup.


How does it compare with building your own GPU-heavy computers? I'm not too familiar with it or with how consumer-grade GPUs fare in these workloads, but training sounds like something that could, in theory, happen locally more easily than the consumer-facing parts.


My greatest takeaway from fiddling around with machine learning for the past few years is this:

Most of the business use cases are deterministic in nature. It is really hard to shoehorn a probabilistic response into such use cases, as illustrated in this article. It's even harder to convert this probabilistic response to a deterministic response (humans in the loop) at a cost comparable to just maintaining the deterministic system.
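
One common (if imperfect) way to bridge that gap is to route low-confidence predictions to a human reviewer. A minimal sketch, with made-up names and an illustrative threshold rather than anything from the article:

  # Act automatically only when the model is confident; otherwise queue for a human.
  from queue import Queue

  CONFIDENCE_THRESHOLD = 0.95
  human_review_queue = Queue()

  def fake_model(item):
      # Stand-in for a real classifier returning (label, confidence).
      return ("refund", 0.80) if "angry" in item else ("ok", 0.99)

  def handle(item):
      label, confidence = fake_model(item)
      if confidence >= CONFIDENCE_THRESHOLD:
          return label                     # fully automated, deterministic-looking path
      human_review_queue.put(item)         # the probabilistic tail goes to humans
      return None                          # caller treats this as "pending review"

  print(handle("routine request"))                                   # 'ok'
  print(handle("angry customer email"), human_review_queue.qsize())  # None 1

The catch, as noted above, is that the human queue is exactly where the cost comes back in.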

There are a few business cases that naturally are probabilistic - ad bidding, stock price prediction, recommendations. The giants have taken over this already.

I am still scratching my head to find a use case where a startup can jump in, use machine learning, and solve a growing business need. There are probably some healthcare-related use cases, but I have no idea about that domain.


To me, the more interesting question with ML isn't "How can I build an entire business around a model," but rather "How can I exponentially increase a product's value with machine learning?"

For example, I don't theoretically need Netflix/Spotify/YouTube to have a great recommendation system. All I need them to do is stream media to my device. However, their recommendation systems add immense value to their product for me.

A similar example would be navigation services (Uber/Google Map's ETA prediction, for example). I don't necessarily need those apps to give me predictions about traffic patterns—their core functionality is just giving me directions—but it makes the product much more valuable to me.

For those businesses, this increase in value leads to an increase in usage and therefore revenue—it's not simply an abstract "nice to have."

I think oftentimes the argument about the value of ML comes down to a binary decision of whether or not ML can 100% replace humans in a given domain, and I think this is a flawed model. ML is capable of improving most products, in my opinion, and we're already seeing it happen.


I would take it a step further. What you're talking about is incremental improvements to traditional services or products.

What the poster above you is getting at, whether they realize it or not, is that they are trying to identify markets where ML could exist solely as the service or product.

I don't think the latter is ever going to be possible for ML, much like that same comment alludes to.

Maybe that makes ML a bit less sexy, but it's probably true of any innovative technology.


Imagine QA of manufacturing production items. You can place cameras next to produced items and observe if they satisfy some quality requirements or sort them into buckets of certain quality levels. Now extend this to many other areas. It's cognitive automation basically, i.e. replacing humans with machines for tasks where humans need to think but which are repetitive.


Computer vision is already commonly used in industry for quality control. Most of the parameters can easily be checked with deterministic algorithms, e.g. my crisps (chips) aren't too burnt or too raw based on simple colour ranging, my biscuits are the correct shape, etc.
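
For reference, "simple colour ranging" really is just a threshold on pixel colours. A toy sketch (assuming OpenCV and NumPy; the HSV bounds are placeholders, not real factory settings):

  # Toy "too burnt?" check via colour ranging.
  import cv2
  import numpy as np

  def too_burnt(image_bgr, max_dark_fraction=0.05):
      hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
      # "Burnt" here means dark, low-value pixels; the bounds are illustrative only.
      burnt_mask = cv2.inRange(hsv, np.array([0, 0, 0]), np.array([180, 255, 60]))
      dark_fraction = np.count_nonzero(burnt_mask) / burnt_mask.size
      return dark_fraction > max_dark_fraction

  dark_crisp = np.zeros((64, 64, 3), dtype=np.uint8)   # an all-black "crisp"
  print(too_burnt(dark_crisp))                         # True

No model, no training data, and the failure modes are easy to reason about.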


ML potentially lets you do that without explicitly listing which failure states you are trying to detect.

I'm not convinced that is a good thing, because it just gives you an excuse to not understand your own manufacturing process and how it can go wrong.


I think this is a good observation. One thing I’ve been thinking about is probabilistically evaluating traditional software. Every uncaught exception, for example, is usually a false negative when evaluating “did the code correctly handle the input?”

Not that ML code is bug-free. But it seems like most traditional software is probabilistically correct, despite not being thought of that way.
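
You can actually measure ordinary code that way. A toy sketch (hypothetical function and inputs, just to illustrate treating uncaught exceptions as misclassifications):

  # Treat "handled the input without an uncaught exception" as a correct prediction.
  def parse_age(text):                  # hypothetical traditional function
      return int(text.strip())

  inputs = ["42", " 7 ", "seven", "", "-3", "3.5"]   # made-up test inputs
  handled = 0
  for s in inputs:
      try:
          parse_age(s)
          handled += 1
      except Exception:
          pass                          # would be an uncaught exception in production

  print(f"empirical accuracy: {handled}/{len(inputs)}")   # 4/6 here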


I've revised the naivety out of their 3 points at the beginning of the article.

--------

In particular, many AI companies have:

1. Lower gross margins due to heavy cloud infrastructure usage and ongoing human support ---> Their product is actually an army of expensive humans

2. Scaling challenges due to the thorny problem of edge cases ---> Their products don't actually work

3. Weaker defensive moats due to the commoditization of AI models and challenges with data network effects ---> No truly valuable technology or systems


It’s important to keep in mind that most of the AI “innovation”/hype is coming from a handful of companies looking to make money from cloud computing or whose moat is built on data. The good old story of making money in a gold rush by selling tools...

I would venture that the most hyped stuff (deep learning) doesn’t really work very well in practice, and the stuff that works reliably well is boring enough to barely register on the hype train as “AI”. To be fair, that’s probably an exaggerated caricature, but that is my reading between the lines of SOTA AI results.

Very few people (even among ML researchers) appreciate the fundamental limitations of associational reasoning (rather than causal reasoning). It’s going to be an interesting time when this message actually sinks in...


I really like your point about this being "associational learning". There is a lot more to intelligence than just learning associations.


Hah too true. Amazon, Nvidia, Coursera, etc. are making most of the money and the "AI startups" along with their investors are paying it.


Another way I'd put it: "SaaS products are technically really quite easy. AI is actually hard".


It's relatively easy to put together a neural network that works as advertised with test data and sort-of works with a good amount of real world data. That's not the problem.

The actual problem is basically what GP said: making a working neural network doesn't mean it provides any value in the real world.


The biggest money in AI is starting online courses.


Or selling GPU instances by the hour


#1 isn’t saying that at all. Cloud bills are expensive for an early stage startup in the AI space. They later touch on the human angle in the post, but it's not fair to diminish their cloud-bill point.

2. No, they don’t work in all cases. Think Tesla autopilot versus a Level 5 car without a steering wheel.

3. True. I wonder if more AI innovation will start happening behind closed doors as a result?


Regarding #3, we still have so far to go to get to human-level intelligence that I think companies benefit more from open research than closing their doors. When a company publishes their research, others can find and publish improvements to their research. Then, the company can use the published improvements to improve their product.


Have you ever tried to reproduce research from academics and/or companies? Some papers take more than a day of work to reproduce. Not only that, if you publish the research without the dataset, it's practically impossible or extremely expensive. Doing GPT-2 from scratch is around $50k of compute time and data collection.

Even a simple deep learning paper will require you to have at least an NVIDIA 1080 Ti if you are lucky, and for NLP I needed to buy an RTX Titan ($2,500 graphics card).


> Some papers take more than a day of work to reproduce

A large part of my job is this. Generally reproducing a paper with no code is months of work.

> Doing GPT-2 from scratch is around $50k of compute time and data collection.

It's extraordinarily rare that you need to do this though. I know a few who have (mostly for foreign languages) and all have been able to access TPU grants from Google or multi-node GPU clusters (which are pretty easy to find if you work in the field - plenty of vendors want someone to test their "supercomputer" for a week).

> Even a simple deep learning paper will require you to have at least an NVIDIA 1080 Ti if you are lucky, and for NLP I needed to buy an RTX Titan ($2,500 graphics card)

Most of my work is in NLP and I do a large amount of it on a 1070.


Yeah, GPT-2 requires a bit more RAM than the 10- or 20-series Ti cards have.

My point, which I think was unclear, was about reproducing the paper from scratch (I guess the model is enough?). I'm not sure how those papers are peer reviewed unless they also send the dataset to the reviewers?

I was also unclear: when I said reproducing, I meant just getting a model that has already been pretrained. I agree with your points.


Peer review almost never means reproducing the model, but that isn't because of the dataset (which is usually available to the reviewer) but because that's not what a peer review is!

A peer review isn't an adversarial process where you think that the person has done something wrong. Instead, it's extra eyes on it to make sure they have thought of everything, and to say what additional tests might be needed and why.


This is a very interesting point! If this is true then it would mean that independent researchers will have a very tough time producing quality research without sufficient funding.


As they say, ML is written in Python. AI is written in PowerPoint.


I mean you could write AI in power point ... https://www.youtube.com/watch?v=uNjxe8ShM-8

/s


Sort of like lies, damn lies and spreadsheets?


My favorite way of describing how AI is different:

  Traditional software: input + program = output
  AI:  input + output = program
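
To make the aphorism concrete, a toy sketch (assuming scikit-learn; entirely illustrative): in the traditional case you write the rule, in the ML case you hand over input/output pairs and get the "program" back as fitted parameters.

  # Traditional software: the programmer writes the rule.
  def program(x):
      return 2 * x + 1

  # "AI": hand over inputs and outputs, get the rule (model) back.
  from sklearn.linear_model import LinearRegression

  X = [[0], [1], [2], [3]]        # inputs
  y = [1, 3, 5, 7]                # desired outputs
  model = LinearRegression().fit(X, y)
  print(model.coef_, model.intercept_)   # recovers ~[2.0] and ~1.0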


I don't think AI is a business.

There are lots of business workflows that AI can improve or enable in the first place, and you might be able to make a business out of that, but you should be aware of the difference.

If you want to be in the business of enabling / improving business workflows, you should be very explicit about that, and AI could be one of many techniques in your tool belt -- but hopefully not the only one.

Starting an "AI business" sounds like "I've got a shiny new hammer, what nails can I hit with it?", which is kinda backwards from my understanding of how businesses should be run. It might work, but it's also easy to fall into the trap thinking AI should be your only tool.


I think the main difference between AI and traditional software is that the latter is actually useful in the real world.

The place where AI has had the most impact is image recognition—and that's cool, but that's a very small part of what we use computers for. As someone who spends almost every waking hour in front of a computer, I don't think I use any software that employs AI for any useful purpose except maybe search engines—and I'm pretty sure that they were a lot better before they used AI. GPT-2 is neat, but there's literally no circumstance under which I would want to use GPT-2-generated text for anything except experiments or toys. And while I'm sure Google makes good use of AI for spying on people and recommending YouTube videos to turn my brother into a nazi, I'd much prefer if they didn't do that.


I worked at an AI startup that pioneered modeling over inbound leads and Sales outcomes (most of this data is in Salesforce + Marketo/Pardot/Hubspot) and that application was hugely valuable for a few companies with massive inbound volume. Zendesk and New Relic were the poster child success stories, since they didn’t have to massively scale their sales development teams to call on every mom and pop / student exploring a trial. It highlighted the 5-10% of their volume that could actually yield real money very effectively.

Unfortunately, there aren't that many companies with that problem, and the market never materialized to the necessary degree, so they sold the existing contracts and pivoted. That said, it was a home run for what it did.

If you find the right problem, AI can create massive value. The hard part is never the learning, it’s finding and framing the right problem.


Deep learning is mostly focused on image recognition, and classification on tabular data is way more common for smaller businesses.

Your online activity triggers tens of AI models every hour. AI is just not much of a consumer technology, but it is what producers increasingly use.

GPT-2 is useful way beyond text generation; it has very powerful embeddings useful for many other tasks. Text generation and cat-picture classification are what pulled this field into the mainstream: that's how you know about it without being familiar with the particulars.

Stop putting the sole blame on recommendation algorithms for radicalization. Or paint everything with the same brush and start blaming Youtube too for radical thoughts on climate change, veganism, identity politics, etc.

I suspect your brother is not even a Nazi anyway, just has differing views on immigrants and black crime statistics. Let's hope some other Youtuber does not get recommended a video on how to punch a Nazi, as by your reasoning, people could not help themselves.


I guess it depends on how narrowly one defines "AI". It sounds like you are defining it as "Deep Learning"... (maybe the article did as well, I didn't read it).


Busted! How did you find out about your brother? I thought we were acting in complete secrecy ...


Interesting take. Someone who comes up with a good NLP AI, though, will probably be closer to 80% margins because you've got a shared model. Image processing, too, is usually built on top of pre-trained models and works surprisingly well. This is known as transfer learning. So I think we will see more online AI models.
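
As a rough illustration of the transfer-learning idea, a minimal Keras sketch (assuming TensorFlow is installed; the layer sizes and class count are placeholders): reuse a backbone pre-trained on ImageNet and train only a small head for your own task.

  # Transfer learning sketch: frozen ImageNet backbone + small trainable head.
  import tensorflow as tf

  base = tf.keras.applications.MobileNetV2(
      input_shape=(224, 224, 3), include_top=False, weights="imagenet")
  base.trainable = False   # keep the pre-trained features fixed

  model = tf.keras.Sequential([
      base,
      tf.keras.layers.GlobalAveragePooling2D(),
      tf.keras.layers.Dense(5, activation="softmax"),   # e.g. 5 custom classes
  ])
  model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                metrics=["accuracy"])
  # model.fit(my_dataset, epochs=3)   # far less data and compute than training from scratch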


I agree that cloud operating costs are lower for NLP AI, but for a different reason than you cite. I think they are lower because the type of data that they operate on is much more compact than AI that is operating on images, video, or audio.


"AI" products are consulting heavy as 80% of the effort is spent on data. Real world data inputs are messy, most need human curation up front.

Simple example: FMCG/Retail would like to use apps to scan shelves for their own product placement and their competitors'. Simple, right? But getting those packshots, accounting for lighting and placement conditions... you need an army of people to do QC.

AI really just moves the analytics/BI load from a company to a vendor - hence it is consulting. Good business to have, but human labor always has shittier margins than software.


Isn't 99.99999...% of any currently used AI good ol' statistics or (not XOR) classical operations research stuff or (not XOR) well-known and rather simple NLP?


The more specific the use case, the more that human-engineered tools/rules will dominate. E.g. Regex beats BERT 99% of the time for any specific use case of value to a real customer, but BERT does a decent job in open domains without much effort.
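
A trivial example of the kind of narrow task where a hand-written rule is hard to beat (the pattern is hypothetical, just for illustration): pulling order numbers out of support emails.

  # One very specific task: extract order IDs like "ORD-123456".
  import re

  ORDER_ID = re.compile(r"\bORD-\d{6}\b")   # made-up format

  text = "Hi, my parcel (ORD-483920) never arrived, unlike ORD-120045 last month."
  print(ORDER_ID.findall(text))             # ['ORD-483920', 'ORD-120045']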


Where does such a 'decent job' show up among the wonders touted as "AI"? Which widely successful product/service sold as AI-powered benefits from it?


No.

For one thing this ignores image recognition, which is probably the most successful field for neural network deployments.

And most modern NLP is "download all the written text on the entire internet, filter it for the good stuff (somehow), throw it into a neural network that is bigger than anything that has been successfully trained before, and wait a week before you can see if it worked". I wouldn't describe it as simple.


Sure, AI is "just" statistics in the same way that software is "just" math.


> Statistics?

As in David Freedman's books?

Uh, statistics tries to be applied probability with some assumptions, theorems, proofs, results, and applications. E.g., the assumptions for regression analysis that let us use an F-ratio and t-tests for hypothesis testing and confidence intervals.

Does current AI with neural networks have such assumptions and results? Maybe with the test data one could develop some such results. And AI seems to have an assumption that the training data and the test data are independent and identically distributed (i.i.d.) random variables. Uh, arguing i.i.d. for some enormous collections of data might be difficult? How about some concept of approximately independent? M. Talagrand has some such in his "A New Look at Independence".
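
For what it's worth, the classical machinery being described is easy to see in code. A toy sketch (assuming statsmodels and NumPy; the data is synthetic): the fitted summary reports exactly those t-tests, F-statistic, and confidence intervals, all resting on the usual regression assumptions.

  # Classical regression with explicit assumptions (linearity, i.i.d. normal errors, ...).
  import numpy as np
  import statsmodels.api as sm

  rng = np.random.default_rng(0)
  x = rng.normal(size=200)
  y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=200)   # synthetic data

  X = sm.add_constant(x)        # intercept + slope
  fit = sm.OLS(y, X).fit()
  print(fit.summary())          # t-stats, F-statistic, 95% confidence intervals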


The models have gotten really complex and have advanced substantially in the last few years.


>> "rather simple NLP"

I'd like to have some of whatever you're smoking. NLP is one of the hardest areas of science with many problems that will not be conclusively solved in my lifetime, both technical and scientific.


I didn't mean that the field is 'simple' (easy to tackle), but that most tools used in the field do not seem to me "intelligent" in any way. Which one is "intelligent", and why?


What tool would seem to you as intelligent? Can you formalize that in some way?


Something able to learn or to discover without relying on a 'brute force' (exhaustive exploration of the problem space) approach.

Neural networks are in a way pertinent, I reckon; however, I'm not enthusiastic about the fact that using them forbids us from 'explaining how they solve the problem'. The same applies, albeit somewhat to a lesser extent(?), to Bayesian methods and such.


In Bayesian optimization, the next guess is chosen according to a non-brute force equation that takes all previous information into account.

Does it appear more intelligent than repeated slight shifts more or less toward the first order gradient?

The second seems to be more similar to how humans learn, but the former more similar to how we rationalize after learning.

Still, the former is just repeated, relatively simple applied math. I'm not sure either case jumps out as being qualitatively different in a meaningful way when it comes to intelligence.
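
A side-by-side of the two strategies being compared, as a minimal sketch (assuming scikit-optimize for the Bayesian side; the objective function is made up):

  # Objective to minimize (made up, purely for illustration).
  def f(x):
      return (x - 0.3) ** 2 + 0.1

  # 1) "Slight shifts toward the gradient": plain gradient descent.
  x, lr = 2.0, 0.1
  for _ in range(100):
      grad = 2 * (x - 0.3)       # analytic first-order gradient
      x -= lr * grad
  print("gradient descent:", x)  # ~0.3

  # 2) Bayesian optimization: each new guess uses all previous evaluations.
  from skopt import gp_minimize
  result = gp_minimize(lambda v: f(v[0]), [(-2.0, 2.0)], n_calls=20, random_state=0)
  print("bayesian optimization:", result.x)   # also ~[0.3]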


Over the decades that computers kept getting smaller, there were lots of new businesses, services, and products that were created thanks to this process.

Now that we're in a phase where we're making our DNNs deeper and deeper, what are some new successful businesses, services, and products that are springing up as a consequence? I can think of Amazon Echo. Any others?


How about the ability to search for objects in your pictures on your phone (granted, this is not a business in itself, but rather a new feature on existing products), online translators that can translate almost anything in any language (including idioms), such as Google Translate or DeepL, or fun niche products like AI Dungeon that give us a glimpse into what could be possible in the not-so-far future... and that's excluding SaaS businesses that sell solutions to other businesses directly, so as a consumer you wouldn't really hear about them. I've worked at two of those companies in the past.

Machine Learning techniques have been applied to numerous different fields. Right now, they're often more of a supporting feature than a product in themselves. But they are out there. And while I agree that a lot of companies are just throwing out buzzwords, it's definitely not all snake oil.


>ability to search for objects in your pictures on your phone

This is pretty cool, can save a lot of time in some cases, and doesn't disappoint because object recognition (even by humans) is unreliable by nature.

>use an online translator to translate almost anything in any language (including idioms) using something like google translate

That's debatable at best. Google Translate, at least, is still horrible at translating Asian languages to English, to the point where it is actually pretty much completely useless. For longer texts, even German to English or Spanish to English produces a lot of nonsense, although it is usually at least somewhat intelligible if you're familiar with the sort of mistakes that Google Translate tends to make.

Anecdotally, I don't use Google Translate any differently than I did 7 years ago. It's good for single words or expressions, it's okay for single sentences if you are aware of its pitfalls, and it's pretty useless for long articles unless all you want is a general idea of what the article is talking about.

>AI Dungeon

I mean, everything we've seen from AI Dungeon was hilarious because of its nonsensical flaws. If those flaws can somehow be removed in the future, we'll be stuck with a version where every individual sentence kind of makes sense but there is no coherent connection or story line, i.e. a product that nobody would use.


You definitely have good points, but since the post I responded to gave Alexa as an initial example, I kind of took that as baseline.

I can't really argue with anecdotes, but I can assure you that a whole lot has changed in Machine Translation in the past 7 years, starting with Google introducing Neural Machine Translation[1] to replace their statistical model, which would often behave in the way you described. That's why I specifically included idioms, which hadn't really been possible until then. It's not yet perfect, but it's crazy good at what it does and only getting better.

I know AI Dungeon was really wonky, but I also said it gives us a glimpse of what may be possible. Because the way it interprets natural language is really something else (granted, that's the underlying model, of course). It's really a product still entirely in its infancy. And I don't think AI Dungeon will take the world by storm much more than it has done thus far, but I could imagine countless applications for a similar but improved technology.

I don't know, OP was asking for products and services, and I gave some examples. Are they flawless? No. Is most technology flawless? No. Will there be growing market for somewhat imperfect AI-based applications? I surely think so. In the end, humans aren't flawless either.

[1] https://en.wikipedia.org/wiki/Google_Neural_Machine_Translat..., https://arxiv.org/pdf/1609.08144.pdf


I worked on the recommender and personalization stuff for a midsized fashion retailer. It took me about 2 years to figure out most of the work done by our ML & big data teammates was pure fiction.

I believe, but have no conceivable way to prove, that all of the easy wins have been won by the first movers.

It's weird. I have almost irrational optimism about the unbounded potential for deep learning, ML, big data, and so forth. Amazing, almost magical, stuff like KP figuring out that Vioxx was killing their patients once they had enough data and asked the right questions.

But I'm also completely skeptical about using these techniques for culture and entertainment and advertising. Methinks that ceiling has already been hit and there's nothing on the horizon.

Too bad. I really wanted our internal version of Stitchfix to actually work. Now I'm not sure it's possible. Or maybe we (or maybe just I) were asking the wrong questions.


Are AI startups just a business model built around some proprietary code/software?


Great article, so... the opportunities for innovators and investors are:

- new OS paradigm: skunkworks in a garage somewhere in the world

- new hardware paradigm: skunkworks in a garage somewhere in the world

- new AI business: quietly disrupting a niche of a niche


Just a side note, but please; on-premises, not on-premise.


Just a side note, but please, use a comma to separate two dependent and incomplete clauses, not a semicolon or colon.


data is the new oil!


My prediction is that the first, really huge, AI breakout company will be the company that can figure out how to deploy AI to automate and scale data normalization/processing/integration pipelines and the resultant downstream features that drive value.


Is this satire? Because I can't tell anymore


I expected to read the keyword 'webscale' at some point but was left disappointed ...


Ways to solve some of the issues mentioned:

1) Using containers, the SaaS software can be downloaded to the client's cluster and managed by Kubernetes operators. Hence the cost of training and storage will be borne by the clients themselves and not by the SaaS company.

2) The use of AutoML should increase the productivity of the startup's employees (especially with the ongoing retraining of models, deployment, monitoring, etc.).
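
On the AutoML point, a minimal sketch of what that looks like in practice (assuming the TPOT library; the dataset and settings are placeholders): the search over models and preprocessing is automated rather than hand-tuned.

  # AutoML sketch: let the library search for a good pipeline automatically.
  from sklearn.datasets import load_digits
  from sklearn.model_selection import train_test_split
  from tpot import TPOTClassifier

  X, y = load_digits(return_X_y=True)
  X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

  automl = TPOTClassifier(generations=5, population_size=20, random_state=0)
  automl.fit(X_train, y_train)          # searches models and preprocessing steps
  print(automl.score(X_test, y_test))
  automl.export("best_pipeline.py")     # exports the winning pipeline as plain code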

The one problem that will always be there is new data and edge cases in the data. I do believe that this would be the major obstacle for the next 5 years.

Also, I would expect to see the number of models actually explode (assuming that they are trained and deployed by AutoML). Case in point: Uber, with models per city/time of day (and thousands of models in production).


This doesn't really work, as most AI modeling is very resource-intensive, requiring hardware (GPU/TPU etc.) that most businesses aren't going to have.


This can be circumvented through the use of transfer learning and super convergence.

You can now train a state-of-the-art image classifier with nothing more than a Google Colab notebook (free to use).

That's for deep learning. For classical ML (i.e. XGBoost) it really doesn't take much time or compute to create a model that has business value.


So I would argue that most businesses have tabular data that can be used to train classical models (XGB, etc.) and reach the same performance as (if not better than) deep learning.
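
For what it's worth, that "classical ML on tabular data" path really is just a handful of lines. A sketch (assuming the xgboost and scikit-learn packages; the dataset is a stand-in for whatever tabular data a business actually has):

  # Gradient-boosted trees on tabular data: cheap to train, no GPU required.
  from sklearn.datasets import load_breast_cancer
  from sklearn.model_selection import train_test_split
  from xgboost import XGBClassifier

  X, y = load_breast_cancer(return_X_y=True)    # stand-in tabular dataset
  X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

  model = XGBClassifier(n_estimators=200, max_depth=4)
  model.fit(X_train, y_train)
  print(model.score(X_test, y_test))            # typically well above 0.9 here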


> Case in point is Uber with models per city/ time of day (with 1000's of models in production).

That is a good case-in-point, because one of the arguments in the article is that AI is expensive.

At what point does 40,000 compute-hours and a few million dollars spent on hundreds of city models become a better use of time and money than an afternoon noodling with ARIMA or some Fourier analysis on a $5,000 workstation?

Perhaps -- perhaps -- at Uber's scale, squeaking out a tenth of a percent is worth the time and money. But the rest of us schmoes can do pretty well with an SQL query, some R or Python or Julia and a generous dollop of good old-fashioned all-American hubris.
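
And that afternoon of noodling really is about this much code. A sketch (assuming pandas and statsmodels; the series here is synthetic, standing in for real demand data):

  # Classic time-series forecast on a workstation: no cluster required.
  import numpy as np
  import pandas as pd
  from statsmodels.tsa.arima.model import ARIMA

  idx = pd.date_range("2019-01-01", periods=365, freq="D")
  rng = np.random.default_rng(0)
  demand = 100 + 10 * np.sin(2 * np.pi * idx.dayofyear / 7) + rng.normal(0, 3, 365)
  series = pd.Series(demand, index=idx)          # synthetic daily demand

  model = ARIMA(series, order=(2, 0, 1)).fit()
  print(model.forecast(steps=7))                 # next week's forecast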


I absolutely agree. I did not say thousands of deep learning models.

There should always be a tradeoff between performance and cost.

But it is much better to give the customer the option, instead of forcing them into the aforementioned SaaS architecture.



