Another way to have tackled this problem would have been to just go down the research route and build a product once the ML was fully baked, but VCs won't fund such adventures.
To get VC funding, we would have had to lie through our teeth, which was a total no-go. I'm not making the righteous argument here, but for once it feels really good that the views we held then are corroborated now; some kind of confirmation bias kicking in.
What we need is a set of accounting metrics for the cost of training and the rate of improvement, so that VCs can get comfortable making projections rather than choosing between buying snake oil and not participating in the AI dev cycle.
PS - emphatically not taking aim here at your specific decision, OP. Startups have different opportunities and interests, and these decisions are tough. I'm sure you know better than I do whether the ethical wizard-of-oz model above is relevant to you in particular.
The main thesis is that there seem to be more and more companies out there solving interesting problems, which by itself is great, but they're bolting a lot of AI wording on top. Most of this seems to be an attempt to differentiate themselves in the market and access funding, and I find it incredibly dishonest.
The whole discussion around AI these days has become so tainted by shysters trying to attract funding and attention that I actively try to distance myself from any association.
We need to focus a lot more on Intelligence Amplification (IA) (https://en.wikipedia.org/wiki/Intelligence_amplification), i.e., building tools for humans to become more productive, and less on AI.
Douglas Engelbart and others had this figured out 50 years ago (see: the Mother of All Demos). The AI hype is dangerous, since it will no doubt lead to the trough of disillusionment.
I can't think of a single example of an "AI" company I talked with that did not know it was explicitly exaggerating its claims and capabilities for the express purpose of attracting funding and fueling the hype fire.
The solid tech startups I've seen are focused on the problem they are solving for customers rather than the tooling (AI).
(I don't file expenses often, but personally when I do so through Expensify, I find SmartScan so slow that I now assume all my "SmartScanned" receipts are going to a human.)
>How to start an AI startup
>1. Hire a bunch of minimum wage humans to pretend to be AI pretending to be human
>2. Wait for AI to be invented
First, the prototyping/bootstrapping with humans is not a terrible idea. It takes a hurdle and moves it down the line a bit. You still need to get over that hurdle, and it's still a potential failure point, but it's not an irrational approach. There are a ton of businesses started on the basis of:
1. make free service
2. get 1bn customers
The deception is a problem, though. Building companies based on deception will probably lead to a nasty crash.
Second, if the goal is a result rather than to use AI... Say you're trying to solve some specific transcription use case, like Spinvox or Expensify. First, the problem seems solvable or soon to be solvable. Second, it's helpful to have a stream of transcription attempts coming to you with real-life accuracy requirements. Third, it's not a bad way of approaching human-AI hybrid systems. Start with 100% human labour and gradually hybridise. Maybe you never solve the hard problems that will get you to 100% AI, but you find ways of leveraging human labour to solve it efficiently.
If what they reach is a tech/process that allows a person to transcribe 10k words per hour... that's pretty good too.
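A minimal sketch of what that gradual hybridisation can look like in code. Everything here is made up for illustration (the stub model, the queue, the confidence threshold): start with a threshold above 1.0 so every job goes to a human, then lower it as the model earns trust.

    import random

    class StubModel:
        """Stand-in for a real speech/receipt model: returns a guess plus a confidence."""
        def transcribe(self, clip):
            return f"machine transcript of {clip}", random.random()

    class HumanQueue:
        """Stand-in for the pool of human transcribers behind the curtain."""
        def submit(self, clip):
            return f"human transcript of {clip}"

    def route_job(clip, model, humans, threshold=1.01):
        """Use the model when it is confident, fall back to a human otherwise.
        threshold > 1.0 means 100% human; lowering it gradually hybridises."""
        text, confidence = model.transcribe(clip)
        if confidence >= threshold:
            return text
        return humans.submit(clip)

    print(route_job("receipt_001.wav", StubModel(), HumanQueue(), threshold=0.9))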
Well, it might be. Are human drivers really prototyping/bootstrapping self-driving cars? We are deeply into the territory of Moravec’s Paradox here.
For a lot of NN/ML applications, the magic ingredient is "humans", plus a record of humans doing something enough times to describe it statistically. AI sign recognition, sentence completion or checkers playing is very often based on estimates of "what would a human do?"
"Would a human say this photo contains cats", is really how a lot of ML interprets the question "where is my cat"?
Moravec's Paradox is that everyone thought that sensory input would be easy and reasoning about those inputs would be hard. But it turns out that the sensory input part is very hard, much harder than anyone thought, and once you have that down, the reasoning is actually simple. So any service that is relying on humans doing the sensory input bit is handwaving away the difficult part. Humans aren't really aware of how much processing is involved in things we take for granted such as sight and hearing. We think that thinking is "special" but it requires relatively little power to do that once the underlying hardware/wetware has already processed the signals.
For example, a fundamental bit of logic like modus ponens (if A implies B and A holds, conclude B) takes only a few lines of code; it's the perception that feeds it which is hard.
Yes and no. If you ask a human why they think this is a cat, they will say: well, look at the whiskers and the ears and the paws and the general overall cuteness. But an NN doesn't have a whisker coefficient or a cuteness multiplier. It doesn't "think" like a human, and nor should we expect it to. ML that is more human-like in its explanations is a decision tree, but once you start getting into random decision jungles and the like, the explainability starts to diminish.
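To make the explainability contrast concrete, here's a toy sketch using scikit-learn. The features and data are invented, but the point is that a small tree prints its own "whiskers and ears" style rules, which is exactly what you lose with deep nets and large ensembles.

    from sklearn.tree import DecisionTreeClassifier, export_text

    # Invented toy data: [whisker_length_cm, ear_pointiness, overall_cuteness]
    X = [[4.0, 0.9, 0.95], [3.5, 0.8, 0.90], [0.5, 0.1, 0.40], [0.2, 0.2, 0.30]]
    y = [1, 1, 0, 0]  # 1 = cat, 0 = not a cat

    tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
    # Prints human-readable if/else rules, e.g. "whiskers <= 2.00: class 0"
    print(export_text(tree, feature_names=["whiskers", "ears", "cuteness"]))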
The point about the paradox is a good one, and very relevant to this issue. I can't tell if it is a paradox about computers/intelligence or a point about people. I suppose it's all the same when it comes to building wizard-of-Oz companies: because of Moravec's Paradox, you will probably misidentify which is the easy and/or hard part of the problem you are trying to solve with AI.
Sadly most of these startups could do much better by just hiring and using human staff to provide the same services.
The business model "use low-paid labor to service wealthy clients" seems a little more inherently stable than that. In most cases, nobody expects a collapse. Why in speech-to-text?
This is a new take on the "value added reseller" business model. (I don't know what else to call it. Multi-level marketing?)
For instance, Autodesk's dealer channel covered the costs of marketing, sales, technical support, training, etc., while Autodesk kept the sweet nectar of profits for itself.
With an internet-based service, if you allow new people to sign up at any time, become popular, and are not able to scale, you will not be able to service all of your users quickly enough. The people you have pretending to be AI become a bottleneck that makes it difficult to scale, because you won't be able to find more people suitable for that job and train them quickly enough. Then the business collapses.
How to get rich: (A) Start a page that applies photoshop effects to user-uploaded pictures. (B) Secretly accomplish A using low-paid grunts. (C) Claim it's done with AI, obtain a billion dollars of VC money
I'm by no means a machine learning expert, but sometimes on a Sunday I'll pour myself a mug of coffee and go through some neat tutorial, and sometimes the stuff is Uber's.
They are by no means free from criticism for their wrongs, but to ignore what they have done correctly paints the world far too simply.
But I believe that concern about privacy is the main issue. If you trust some firm with your data, you also trust their data systems. And their staff. But if they're giving your data to numerous third parties, there's far more potential for leaks and malicious activity.
This reminds me of the NSA's argument that data collection and processing isn't illegal, because ...
> According to USSID 18, a top-secret NSA manual of definitions and legal directives, an "intercept" only occurs when the database is queried — when someone actually reads the text on a screen.
- Edison Software: The complaint is that their engineers went through some users' personal emails to "improve" a feature, and that this was a privacy violation. This is not addressing what I was asking. To me this means their AI really is doing the work by default, but in cases where the AI doesn't behave well, they have humans intervene. Moreover, it's not clear to me that they ever claimed the work was fully AI-driven in the first place.
- Spinvox: Again, same thing: "The ratio of humans to messages and humans to number of users is very, very low."
- X.ai: The link is behind a paywall, so I can't view it and don't know what their case was.
- Clara: same as X.ai
- Expensify: Again, same as Spinvox and Edison Software: "Expensify admitted that it had been using humans to transcribe at least some of the receipts it claimed to process using its “smartscan technology”."
- Facebook: Did Facebook actually claim M was entirely AI-powered? The fact that humans were used isn't enough to answer my question of whether they were "claiming to use AI but instead actually using human workers".
Note that I'm NOT defending these practices by any means, or suggesting that they're not privacy violations. All I'm saying is they don't answer my question, which was entirely about the use of AI, not about privacy violations.
To hopefully clarify what I was looking for: I was looking for examples of companies that are explicitly advertising that their services are AI-powered, but actually using humans to do the heavy lifting. Meaning I'm looking for a company that does much more than have humans go through user data to improve its existing AI or handle edge cases. That humans occasionally become involved at some point to improve the service is not enough; the company needs to be more or less actively lying about its use of AI while the work is being delegated to human workers. (I asked for examples of this because that seemed to be the criterion the parent comment had.)
I'm not saying that silicon/mechanical intelligence isn't possible. I'm unaware of any physical law that precludes it. But what we currently call "AI" is just the pathetic fallacy run wild.
All that said, multidimensional data-driven linear recognizers are pretty impressive.
Define AI first.
One of the first few lines on Wikipedia about AI:
The scope of AI is disputed: as machines become increasingly capable, tasks considered as requiring "intelligence" are often removed from the definition, a phenomenon known as the AI effect, leading to the quip, "AI is whatever hasn't been done yet."
Consider speech recognition. Obviously an AI problem solved, right? Except, the AI problem and what the current solutions are solving are two different things. The AI problem is understanding and appropriately reacting to spoken words, as if a human was on the other end. Current implementations are glorified pattern matchers run over clever hashes of sound recordings. There is no understanding happening, there are no concepts forming within the machine (and the understanding is not back-fed into pattern matcher to correct the sensory input on the fly). The difficult parts, the ones that make speech an AI problem, have been entirely sidestepped with mathematical tricks. It sort of works, but its scope is nowhere near the original AI problem.
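To make "glorified pattern matchers run over clever hashes of sound" concrete, here's a toy sketch of the flavour of thing happening under the hood (librosa for the features, nearest-neighbour for the matching; the file names and templates are invented, and real systems are vastly more elaborate than this):

    import numpy as np
    import librosa

    def fingerprint(path):
        """Crude 'hash' of a recording: the average MFCC vector."""
        y, sr = librosa.load(path, sr=16000)
        return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)

    def recognise(path, templates):
        """Return the label whose stored fingerprint is closest.
        No understanding, no concepts forming: just distance in feature space."""
        probe = fingerprint(path)
        return min(templates, key=lambda label: np.linalg.norm(probe - templates[label]))

    # templates = {"yes": fingerprint("yes.wav"), "no": fingerprint("no.wav")}
    # print(recognise("unknown_clip.wav", templates))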
Similar analysis can be made for anything that is mentioned with the "no longer AI" quip. Can a DNN recognize a hot dog? Sort of, for some definition of "hot dog", only if input images are clear and similar enough to the training set. It's a cute trick, and you can make a business out of it if you control enough of the environment around the system, but it's still nowhere near what we mean when thinking about AI recognizing objects.
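For instance, the hot-dog case with an off-the-shelf ImageNet network boils down to something like this toy sketch (the image file is hypothetical, and the model just reports whichever ImageNet label scores highest; it has no concept of a hot dog):

    import torch
    from PIL import Image
    from torchvision import models

    weights = models.ResNet18_Weights.DEFAULT          # pretrained ImageNet weights
    model = models.resnet18(weights=weights).eval()

    # "maybe_hotdog.jpg" is a hypothetical file; preprocessing comes with the weights
    img = weights.transforms()(Image.open("maybe_hotdog.jpg").convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        probs = model(img).softmax(dim=1)[0]

    top = probs.argmax().item()
    print(weights.meta["categories"][top], float(probs[top]))  # best guess + confidence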
If you can't replicate what a human does, it's not AI. The fact that we can only beat humans in very, very narrow applications/games, and that we don't have a generalized model for learning, is a clear sign of the failure of the AI hype.
Note that machine learning may be popular today thanks to the recent successes of deep learning, but learning is not the only thing that makes humans intelligent. For instance, there is reasoning about what you know, and possibly other things like motivation, etc.
Honestly, that's also the problem with most articles about AI. They are either praising the great mystical AI for recognizing cats or blaming AI for not achieving X, which humans can.
But just keep in mind, when most people talk about AI, they're knowingly talking of something much more limited. So you're going to constantly have communication failures with people who are defining AI differently than you.
(Many people nowadays use the term AGI [Artificial General Intelligence] to mean what you think of as AI, btw).
I should also point out that literally all the classifiers "we have nowadays", as per your comment, have been known for at least 20 years (including deep neural networks).
I'm pointing all this out because your comment suggests to me that your knowledge of AI and machine learning in particular is very recent and goes as far as perhaps the last five or six years, when deep nets popularised the field.
If that is so, please consider reading up on the history of AI. It is an interesting field that goes back several decades and has had many impressive successes (and some resounding failures) that predate deep learning by many years. I recommend the classic AI textbook "Artificial Intelligence: A Modern Approach" by Stuart Russell and Peter Norvig. You'll notice there that, even in recent editions, machine learning is a tiny part of the material covered, because there is so much more to AI than just deep neural networks or statistical classifiers.
If I'm wrong, on the other hand, and you already have a broad knowledge of the field, then I apologise for assuming too much.
Intelligence is a spectrum; some of the things we have created are intelligent by the definition of the word. They are not human-level intelligence, yet.
For example, the machine becomes a master chess player and then uses this ability to become a master at backgammon.
So our expectations of AI getting vastly better solely through better transfer learning might be too high.
As an aside- you might argue that there are common elements of board games' design that make it easier to reuse knowledge. However, statistical machine learning systems can still not transfer knowledge of one game to others, even given the common design elements that should make it easier.
Let's see how the machine learns about the concept of transfer learning in the first place :)
Data science: observing trends in data
Machine learning: humans develop models that are fit to data to make predictions
AI: Computers make modeling decisions entirely autonomously. No human input. Dump data, get predictions.
See this Wikipedia disambiguation page: https://en.wikipedia.org/wiki/Von_Neumann_machine
I have a feeling your statement will not stand the test of time well at all.
"...Expert systems are really slightly dumb systems that exploit the speed and cheapness of computer chips...There are many expert systems in the literature which are nothing more than a series of fast if-then-else rules...you do that a couple of hundred thousand times it can look remarkably intelligent"
The interview above was broadcast in...1984. There have been advances since then of course, but at the same time this 30-year-old quote still has a certain ring of truth to it.
Here's the clip from the TV programme featuring the quote above:
Not sure if this one has been debunked yet, but I remember learning in my psych class about the guy who fell asleep, got out of bed, drove a car a bunch of miles, and at his destination murdered his in-laws, all while apparently asleep.
It was used as an example for conscious/unconscious behavior.
As mentioned in the quote, this is usually done through if/else: "if this is the case, then the next step is that".
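A toy sketch of what such a rule-based "expert system" boils down to: a loop of if-then rules firing until nothing new can be derived. The rules below are invented examples, not from any real system.

    # Minimal forward-chaining rule engine: facts in, rules fire, new facts out.
    RULES = [
        ({"has_fever", "has_cough"}, "suspect_flu"),
        ({"suspect_flu", "short_of_breath"}, "refer_to_doctor"),
    ]

    def forward_chain(facts, rules):
        """Apply 'if these facts hold, add that fact' until nothing changes."""
        facts = set(facts)
        changed = True
        while changed:
            changed = False
            for premises, conclusion in rules:
                if premises <= facts and conclusion not in facts:
                    facts.add(conclusion)
                    changed = True
        return facts

    print(forward_chain({"has_fever", "has_cough", "short_of_breath"}, RULES))
    # -> includes 'suspect_flu' and 'refer_to_doctor'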
The reason they are called Expert Systems is that they encode the knowledge of human domain experts as explicit if-then rules, so the program can reproduce the expert's decisions.
What you actually see is that improvements in perceived machine intelligence show diminishing returns to increasing compute capacity, which is a good sign that people are on the wrong track to achieving general AI, and that future improvements in perceived intelligence will grow at a slower rate rather than increase exponentially.
AI is huge. Machine learning is definitely changing how people work and will work. It is not there yet, but it'll get there. Some people just want to monetize the hype.
I don't understand your point. Please explain.
I guess they're meant to be some kind of arrow?
This is not really a healthy state for the economy, but seems to be how every technology wave happens. The real innovation will come when the cheap money dries up.
If you don't cut corners in the way you do the manual tasks, you charge enough to cover your costs, and enough tasks can be automated that using your service is not more expensive than not using it, you're OK. But you probably don't have the margins the VCs want.
Yes, quietly means "without making noise". I don't know why GeekWire is so breathless about using the English language properly.
I went searching to see if someone had written about the phenomenon last year, but eventually concluded I was probably mad. Thanks!
It never fails to get a good laugh.
Anyway, I think that human interaction in the training aspects of AI (preparing data, labeling examples, testing models, etc.) is really hard to automate entirely and should be considered part of the development process. The execution side of an application component that is marketed as AI/cognitive, however, is not true AI unless it is totally free of human interaction.
Otherwise, it's all bovine manure.
In companies that I have worked for that wanted to use AI, there was always a clause in the privacy agreement allowing employees to look at users' content in order to determine how well the algorithm was doing.
There are some differences: 1. PII was redacted (to the extent possible); 2. customers could opt out; 3. the AI wasn't core to the service.
That said - sci-fi AI does seem to require humans behind the curtain.
Personally, I am more comfortable with a human being my virtual "AI" assistant than I am with real AI. I feel a human is less likely to make a mistake that will cause me personal damage; there is more responsibility/liability attached to doing a good job.
Does this include Uber's human drivers?
I see people going to work 16 hours a day transferring these haircut-related transactions from one database to the other, for AI bosses that don't physically exist (and therefore don't need haircuts). The AI boss doesn't see the need for 8-hour workdays, because it's not human. And the money for the work doesn't need to be enough to live on, right? The AI government will pay everybody a basic income anyway.
When we have transfer learning or one-shot learning, it will be a different story.
Same applies to my voice recordings. Same applies to my receipts. Same applies to my health data.
If you say you're taking people from A to B in a motor-powered cab but you actually come to pick them up in a rickshaw, that's wrong. People will naturally form expectations about the ride based on what you say.
It works well. And considering how much effect HN has on our daily lives, it’s neat that the algorithm reduces to “this human has good taste.”
I see no reason why HN won’t be around for decades. And that’s exciting. It’s the only newspaper that feels like a community.
This used to feel strange: if you think about the position of influence and power HN commands, it's hard to feel ok about ceding control to a handful of people, no matter how benevolent. And on certain topics this has indeed been an issue: if a certain behavior or conversation isn't tolerated on HN, it's easy to feel like you're a misfit who doesn't belong in tech, or even that you don't identify with the tech community.
One way to become comfortable with this situation is to trust incentives, not people. The only way HN wins is if HN stays fascinating. It’s why most of us keep coming back: to monitor the pulse of the tech scene, or to learn physics factoids, or to spot a new tool that saves you hours.
And that’s why HN can’t be fully automated. Which stories fascinate you? If you could write an algorithm to generate an endless stream of interesting content, you’d have invented an AI with good taste. And for the moment, that’s beyond our capabilities.
As the earth orbits the sun, the area swept out by the triangle formed by sun -> earth -> earth-three-weeks-later is equal to the area of the triangle swept out during the next three weeks, and the next three weeks after that, and so on. Equal time periods = equal areas. That's why the earth moves faster when it's closer to the sun: it has to cover a greater distance in order to sweep out the same area as during the previous three weeks. This gives you an intuition about how gravity behaves. And with a slight tweak it also holds true for, e.g., an ice skater twirling around. If you stick your leg out while spinning, your leg has to sweep out the same area per time period at a much larger radius. That's why you spin more slowly: covering that larger sweep takes longer, so your rotation slows down proportionally. (This is conservation of angular momentum, dressed up in intuitive clothing.)
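The same intuition in symbols (standard textbook physics, nothing specific to the comment above): the area swept per unit time is fixed by the angular momentum, so if angular momentum is conserved, equal times sweep equal areas, and a larger radius forces a slower rotation.

    % Areal velocity of an orbiting body (or a skater's outstretched leg):
    \[
      \frac{dA}{dt} = \tfrac{1}{2}\,\lvert \vec{r} \times \vec{v} \rvert = \frac{L}{2m} = \text{const.}
      \qquad\text{with}\qquad
      L = m\,r^{2}\,\omega ,
    \]
    % so constant L with larger r (leg stuck out, or Earth far from the Sun)
    % means a smaller angular speed \omega.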
This is in contrast with sites like YouTube, which give an endless stream of interesting videos. Part of HN's strength is its unified front page. We all see the same thing. And that's why writing an algorithm to make the front page interesting is much harder than creating a personalized neural network trained to show you all the interesting things you haven't seen yet.
YT has a problem with trolls and harassers posting reaction videos that YT gets tricked into thinking are "similar".
Of course, the privacy concerns are there, but then again, if it's a real "AI", then it may be worse for the computer to read your data than for a random low paid worker ;)