The Fall of Babylon is a warning to AI unicorns (wired.com)
70 points by sendandthrow 7 months ago | 44 comments



I was hoping they were going to somehow draw parallels from ancient (Mesopotamian) Babylon. Sadly, it's some apparently overhyped health startup, which isn't nearly as interesting, although probably a more relevant comparison.


Babylon Berlin (seasons 1 and 2) were great.

https://en.wikipedia.org/wiki/Babylon_Berlin

This show draws some parallels if that's what you seek.


Why aren't there any unicorns named after Sodom and Gomorrah?


Mostly because the similarity is implied.


This is why I came here and was sorely disappointed.


Could be worse. I came here to make snarky comments about how maybe having profits that don't align with their valuation should be enough warning to AI unicorns, but then found out that's what the article was about... So sad I can't make the comments.


There's a difference between a company that overextends itself and fails to live up to its own promises, and the current explosion of LLMs. Are there some snake-oily VC pitches that over-promise what LLMs can do? For sure. The big distinction is that LLMs already have a whole bunch of things they're great at, while Babylon was never really able to perform even one of its promised features very well.


> The big distinction is that LLMs already have a whole bunch of things they're great at

I would say mediocre, not great, and it's up to the market to decide whether the quality threshold is crossed. So far there's no reliable data showing that an LLM startup started making good money and that its clients didn't leave after a year of mediocre results.


I feel like LLMs won't find their success in this "here's an LLM product!" kind of way, but more by being leveraged internally by companies to increase velocity, do more with fewer employees, etc. I've certainly gained a lot from LLMs just by getting help on my own projects.


The article suggests they never really built an ML-based approach where they take in huge numbers of cases to train a system, so I don't think it can be used to say anything about whether that approach works.


What's the warning exactly? I'm pretty sure absolutely everyone involved in running that company made loads and loads of money all along the way.


Someone should see if GPT can summarize the main points, but it's the usual startup problems: growing too fast and failing to deliver on the company's main value proposition because of employee turnover and bad hiring practices.


Yeah I guess ultimately I'm just saying even "failed" startups are generally a huge financial/career boon to everyone involved, so long as it was just a business failure. The only real loser here is the VC.


The British NHS also was a big loser. It paid a lot of money for something that didn't work, and it lost out on other opportunities that may have panned out better.


I've yet to see a convincing new business model for LLMs. They are impressive, don't get me wrong, but right now they only seem to augment many already existing businesses or workflows in one way or another.

Unless that changes, "A.I." will be a commodity that you purchase like you already do with storage. Hardly exciting, tbh.


The fact that the company even made it so far is quite a mystery to me. The tech stack, the complete lack of understanding of how things worked in the real world, the occasional illiterate product manager, and many, many other things made me quickly realise that it would go belly up within a couple of years.

Fast forward to today, and I was absolutely right.


It's such a shame. From a user point of view they were great. I'm registered with them. The UK GP practice has been taken over by someone else (eMed).

The app was great. I got appointments fast until very recently.

The triage was basic, though; to an end user it wouldn't have been obvious to what extent their business plan relied on it for the shot at future profitability.

I used to work for a VC, and we invested in PushDoctor, a competitor of theirs which also failed. I remember people's slight panic at Babylon's "AI" and how puzzled I was by that, because there clearly was no AI involved. To me as a user, the proposition was a great booking experience and flexible video bookings.

I hope eMed manages to keep the good bits running, because I really don't want to go back to my old GP service.

EDIT: Actually, I'd be fine with going back to my old GP if they used Babylon's booking mechanism and offered video consultations, and maybe used it to provide overflow when they have capacity issues. Nothing wrong with the doctors. This was sort of where PushDoctor was headed when they failed - providing a platform - and it was a real shame.


This comment is so much more informative than the article. They made a medical UX that people actually wanted to use but oversold the AI. I wonder why nobody tried to merge the front-end with a different AI, or did that fail too?


I think the big issue here is that, from a user's point of view, the "AI" is mostly uninteresting as long as it's not really bad, unless it's really amazing.

As a user I want to resolve my medical issue.

If the AI is really bad, it would become a nuisance and force me to go through annoying and poorly thought-through questions to get an appointment, and/or make me answer "wrong" to get to where I want to be. Babylon's "AI" had few enough questions that it didn't feel like a barrier, so at most it was a minor nuisance.

Had it been really good, maybe I'd have cared, but what does "really good" look like here? If what I have is ambiguous, or needs tests, I still need an appointment. From a user point of view there is no happy case there, even if they get it to a point where it saves them money, because it doesn't save me money, and I can't tell if it's saved me time, e.g. by routing me to a person better able to help.

So the AI was sold to investors as a means to cut costs by taking on more patients per expensive human resource, be it doctors or nurses or others. But the decisions need to be really good, otherwise extra referrals down the line eat the savings before you've freed up enough doctors' time to fund the software developers, project managers, etc. needed to build that AI.

And it seems like they basically got drunk on the valuations that mentioning AI got them, took all the money they could get, and then failed to exercise any reasonable cost control, when they should have realised that even if they had something really good, the proportion of revenue per patient they could shunt towards development is small.

E.g. in the UK, the NHS pays an average of around 160 pounds / $195 per patient per year to GPs to cover everything. The margin you can extract from that after paying doctors and general practice costs doesn't fund a very large development operation until you get to a very significant scale.
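
As a rough back-of-the-envelope sketch of that point (every figure except the ~£160 per-patient payment mentioned above is an assumption picked purely for illustration):

    # All numbers other than the per-patient NHS payment quoted above are assumed.
    NHS_PAYMENT_PER_PATIENT = 160        # GBP per registered patient per year
    ASSUMED_MARGIN_RATE = 0.05           # assume 5% left after doctors and practice costs
    ASSUMED_COST_PER_ENGINEER = 100_000  # assumed fully loaded GBP per engineer per year
    ASSUMED_TEAM_SIZE = 50               # assumed modest product/engineering team

    margin_per_patient = NHS_PAYMENT_PER_PATIENT * ASSUMED_MARGIN_RATE  # ~8 GBP/year
    dev_budget = ASSUMED_TEAM_SIZE * ASSUMED_COST_PER_ENGINEER          # 5,000,000 GBP/year
    patients_needed = dev_budget / margin_per_patient

    print(f"Margin per patient: ~{margin_per_patient:.0f} GBP/year")
    print(f"Patients needed to fund a {ASSUMED_TEAM_SIZE}-person team: {patients_needed:,.0f}")
    # -> roughly 625,000 registered patients just to cover the team, before any profit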


There seem to be a lot of companies like Babylon around these days.

Many are full of hot air and... actually, that's it. They're mostly hot air. Maybe they think their hot air smells a little different?

If the IPO markets don't open soon and VC firms remain cautious for much longer, it will be hard for these companies to tap into the new supply of fresh air they urgently need to stay afloat, and we'll see more of them dwindle to nothing.



Never even heard of this company until now. Maybe the lesson is to stick with only the big companies, like Nvidia or Microsoft. Any fly-by-night company can call itself an AI company.


Good! They were the worst private health care I've ever had. Public health care in the UK was better.


Damn, I'm going to have to actually go to a doctor now.


The UK part, at least, has been taken over by eMed. It might well be that much of it will be kept operational, sans the "AI triage", which to me never seemed like more than a basic decision tree.


Ah interesting, thanks. Agree 100% on the "AI triage" too.


I'd like a word for what is now the ritualized 'weirdo CEO anecdote as a hook into why a company failed' trope. Make no mistake: if Babylon were currently worth $20bn, Antigua standups and calling employees Babylonians would be the subject of business-school case studies.

That said, this is a puff piece in Wired for so many reasons, not least of which is the question of whether or not app-based diagnosis and triage is a good idea. It seems to me almost certainly the future, and while Wired notes that a UK physician thinks Babylon sucked at this, it makes no effort to review any recent literature on AI capabilities. My limited understanding is that AI capabilities for triage and diagnosis are pretty hopeful right now.

I guess: "IPOed too early, didn't preserve cash, got caught in the end of zero interest rate investing" is too short to deserve a full article.


Whenever I go to a startup founders' gathering, I typically think, "I am usually the biggest weirdo in the room, but in this crowd I feel like one of the most normal people."

And it's still true at the later-stage startups as well, though it's most prominent at the seed stage. There really are just a lot of weirdos among startup founders.


I think it was IBM that demonstrated in the 80s that a decision tree outperformed doctors at diagnosis.


First line diagnosis is one of the hardest things we expect doctors to do. It's unlikely to fund a unicorn though, because it's also one of the least well paid activities for a doctor.


Babylon's was really basic, and what they effectively seemed to do with it was really basic too.

Obvious moves: shunt the really urgent stuff off to A&E without talking to a doctor first, and nudge less serious things to an Advanced Nurse Practitioner, who would be cheaper for them.

But I imagine there's a tricky tradeoff there where every ANP consultation that leads to a referral back to a doctor because the ANP couldn't address the issue ends up costing them their savings from multiple ANP referrals.
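
To make that tradeoff concrete, here's a minimal break-even sketch; both consultation costs are assumptions for illustration, not Babylon's actual figures:

    # Toy break-even calculation for the ANP-vs-GP routing described above.
    # Both cost figures are assumptions, not Babylon's real numbers.
    COST_GP_APPOINTMENT = 40.0   # assumed provider cost per GP consultation (GBP)
    COST_ANP_APPOINTMENT = 25.0  # assumed cost per Advanced Nurse Practitioner consultation

    def expected_cost_via_anp(onward_referral_rate: float) -> float:
        """Expected cost when triage sends a patient to an ANP first and a
        fraction of those visits still end up needing a GP afterwards."""
        return COST_ANP_APPOINTMENT + onward_referral_rate * COST_GP_APPOINTMENT

    # ANP-first only saves money while ANP + r * GP < GP, i.e. r < 1 - ANP/GP.
    break_even_rate = 1 - COST_ANP_APPOINTMENT / COST_GP_APPOINTMENT
    print(f"Savings disappear once more than {break_even_rate:.0%} of ANP visits "
          f"are referred on to a GP.")

With these made-up numbers, a bit more than a third of ANP visits bouncing back to a GP is enough to wipe out the savings.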


Good value for patients who can't find/afford a doctor, then.


Given what inputs? Doctors don't seem to be very competent at diagnosing things without a bunch of very expensive tests for unrelated things first, and even then it's just weeding out what it isn't rather than accurately predicting what disease a person has.


I don't know if IBM created anything in this area back then, but the MYCIN expert system was developed in the 1970s at Stanford. MYCIN was shown to outperform infectious disease experts in blind testing by 1979:

>... Eight independent evaluators with special expertise in the management of meningitis compared MYCIN's choice of antimicrobials with the choices of nine human prescribers for ten test cases of meningitis. MYCIN received an acceptability rating of 65% by the evaluators; the corresponding ratings for acceptability of the regimen prescribed by the five faculty specialists ranged from 42.5% to 62.5%. The system never failed to cover a treatable pathogen while demonstrating efficiency in minimizing the number of antimicrobials prescribed.

https://jamanetwork.com/journals/jama/article-abstract/36660...

https://en.wikipedia.org/wiki/Mycin


I wish I had the link to the YouTube video where a researcher described one of the main reasons why MYCIN failed: when you optimize for the number of patients per doctor, it is a lot more cost-effective to prescribe a broad-spectrum antibiotic that kills all the pathogens and move on to the next patient than it is to spend time with one patient using MYCIN to narrow down the exact pathogen.


And that worked out well! When you optimize for reducing costs, not outcomes, the outcome you get is antibiotic resistant bacteria!

Seriously though, if we take a "maximize outcomes" approach, then doctors don't actually need to spend any time with a patient - the computer can do it. Maybe this isn't optimal for wealthy first-worlders, but it will be better than what poor first-worlders and everybody else have now. Many people in the US simply do not go to the doctor, unless it's an unavoidable ER trip.


My Apple ][ came with a BONE TUMOR DIAGNOSIS program written in Applesoft on the contributed software floppies from Apple, "COPYRIGHT 1978 APPLE COMPUTER, INC." You might consider it Apple's earliest AI Powered Health App! You can run it in an emulator here:

https://archive.org/details/a2_Biology_19xx_

    Text found in Biology_19xx__.do/BONE TUMOR DIAGNOSIS.bas:

    0 TEXT : GOTO 2000
    19 DIM PB(9,19)
    20 FOR I = 1 TO 9: FOR J = 0 TO 18: READ PB(I,J): NEXT : NEXT
    25 DATA 15,20,35,45,20,80,99,1,100,20,40,60,1,0,15 ,35,50,0,0
    26 DATA 5,75,20,5,20,80,100,50,75,0,90,10,0 ,30,50,35,15,0,0
    27 DATA 3,50,35,15,30,70,30,20,100,25,85,15,0 ,2,85,15,1,0,0
    28 DATA 17,25,25,50,35,65,40,1,85,65,20,80,5,65,15,20,25,25,15
    29 DATA 10,20,20,60,5,95,55,0 ,90,65,20,80,25,2,0 ,10,40,30,20
    30 DATA 25,65,25,10,10,90,30,05,95,75,15,85,98,05,0, 0,10,30,60
    31 DATA 5,20,35,45,0,100,30,1,100,50,25,75,100,5,15,25,55,5,0
    32 DATA 15,70,25,5,35,65,20,5,85,90,15,85,0,0,0,5,10,20,65
    33 DATA 5,10,25,65,20,80,50,1,85,80,15,85,0,0,0,0,20,30,50
    49 FOR I = 1 TO 10: READ Z$(I): NEXT I
    50 DATA I-A,I-B,I-C,II,III,"95-100%","40-95%","25-60%","15-35%","4-15%"
    55 IF NOT Q THEN NEW
    60 DATA GIANT CELL TUMOR,CHONDROBLASTOMA,CHONDROMYXOID FIBROMA,CHONDROSARCOMA
    62 DATA FIBROSARCOMA,OSTEOSARCOMA,PAROSTEAL SARCOMA ,EWING'S TUMOR,RETICULUM CELL
    70 FOR I = 1 TO 9: READ DIAGNOSIS$(I): NEXT
    80 FOR I = 1 TO 9:PB(I,19) = 1: NEXT
    90 HOME : PRINT " THIS PROGRAM USES A PRIOR PROBABILITY"
    92 PRINT "MATRIX PUBLISHED BY G.S.LODWICK M.D. IN"
    94 PRINT "THE RADIOLOGIC CLINICS OF NORTH AMERICA"
    96 PRINT "IN 1963."
    98 PRINT : PRINT "THE PROBABILITY OF THE DIAGNOSIS IS"
    99 PRINT "CALCULATED USING BAYES' RULE FOR THE"
    100 PRINT "PROBABILITY OF CAUSES"
    102 PRINT : PRINT "NINE DIAGNOSES ARE CONSIDERED"
    105 FOR I = 1 TO 9: PRINT I; TAB( 4);DIAGNOSIS$(I): NEXT
    106 PRINT : PRINT "THIS PROGRAM MAY NOT GIVE THE CORRECT DIAGNOSIS. FINAL DIAGNOSIS CAN BE OBTAINED ONLY BY BIOPSY"
    110 PRINT " HIT ANY KEY TO CONTINUE";: POKE -16368,0: GET A$
[...]

    2000 HOME : VTAB 5: PRINT " COPYRIGHT 1978 APPLE COMPUTER, INC."
    2010 FOR K = 1 TO 2500: NEXT K
    2050 VTAB 12
    2060 PRINT "THIS PROGRAM IS FOR PHYSICIANS ONLY!"
    2065 PRINT : PRINT
    2070 PRINT "IT IS DESIGNED TO BE USED AS A SINGLE"
    2080 PRINT "TOOL WHICH, ALONG WITH MANY, MANY":Q = 1
    2090 PRINT "OTHERS, CAN LEAD A QUALIFIED PHYSICIAN"
    2100 PRINT "TO A PROPER DIAGNOSIS."
    2105 PRINT
    2110 PRINT "ANY USE BY A NON-QUALIFIED PERSON TO"
    2120 PRINT "DIAGNOSE ANY CONDITION WOULD BE"
    2130 PRINT "GROSSLY MIS-LEADING AND DANGEROUS."
    2150 FOR K = 1 TO 18000: NEXT K
    2160 GOTO 19
    3000 REM BASED ON A PROGRAM BY JEFFREY DACH M.D.
    3010 REM 909 ROSCOE CHICAGO,ILL.60657
    3020 REM COPYRIGHT 1978 APPLE COMPUTER, INC.
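
For anyone curious what "Bayes' rule for the probability of causes" amounts to in that listing, here is a minimal modern sketch of the same idea; the priors and likelihoods below are invented toy numbers, not Lodwick's 1963 matrix from the program above:

    from math import prod

    # Toy version of the calculation the BASIC program describes: combine a
    # prior probability for each diagnosis with per-finding likelihoods, then
    # normalize. All numbers here are made up for illustration.
    priors = {                   # P(diagnosis)
        "GIANT CELL TUMOR": 0.15,
        "OSTEOSARCOMA": 0.25,
        "EWING'S TUMOR": 0.10,
    }
    likelihoods = {              # P(finding | diagnosis)
        "GIANT CELL TUMOR": {"age under 30": 0.6, "lytic lesion": 0.9},
        "OSTEOSARCOMA":     {"age under 30": 0.7, "lytic lesion": 0.4},
        "EWING'S TUMOR":    {"age under 30": 0.9, "lytic lesion": 0.5},
    }

    def posteriors(findings):
        """P(diagnosis | findings), treating findings as conditionally independent."""
        unnorm = {dx: priors[dx] * prod(likelihoods[dx][f] for f in findings)
                  for dx in priors}
        total = sum(unnorm.values())
        return {dx: p / total for dx, p in unnorm.items()}

    print(posteriors(["age under 30", "lytic lesion"]))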


What -- are you telling me that WIRED MAGAZINE does PUFF PIECES??!

You've Got Smell!

https://www.wired.com/1999/11/digiscent/

>But the orange peel is just the beginning, a humble hors d'oeuvre. Marc Canter, sometimes known as the father of multimedia, joins us to serve the multiscented main course.

With Marc Canter as boastful spokesman, biggest weirdo, proud maître d', and influencer under the influence of the munchies, it's more of a Puff Puff Pass piece.

>"You know, I don't think the transition from wood smoke to bananas worked very well." -Marc Canter

https://news.ycombinator.com/item?id=29225777


Babylon's triage was a set of very basic questions. The problem is that there's only so much it can help you. They would try to use it to shunt you to Advanced Nurse Practitioners instead of GPs, which is fine - I'd often pick one myself if I knew that would likely be enough.
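
For a sense of scale, a triage built from "a set of very basic questions" could be as small as the sketch below; the questions and routing rules are invented for illustration and are not Babylon's actual logic:

    # Toy triage of the kind described above: a handful of fixed questions that
    # route a patient to A&E, a GP, or an Advanced Nurse Practitioner.
    # Questions and thresholds are invented for illustration only.
    def triage(answers: dict) -> str:
        if answers.get("chest_pain") or answers.get("difficulty_breathing"):
            return "A&E"                # clearly urgent: skip the doctor entirely
        if answers.get("symptom_days", 0) > 14 or answers.get("recurring"):
            return "GP appointment"     # longer-running or recurring issues see a doctor
        return "ANP appointment"        # everything else goes to the cheaper clinician

    print(triage({"symptom_days": 3}))             # -> ANP appointment
    print(triage({"difficulty_breathing": True}))  # -> A&E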

But it'll be rare that they can safely tell people they don't need an appointment at all, and I have to imagine it doesn't take many ANP appointments leading to a referral to a GP before the savings from referring to a nurse first disappear.

I'm sure it'll eventually pay off, but I'm not sure just how many appointments they can avoid that way.


I've found that VCs in the past never really inquired much about a team's technical ability to get their project done, just about the business and competitive aspects. They just assume that if you're talking to them, you know what you're doing. This probably made a lot of sense in the 2010s, because most startups were pretty basic web or mobile apps, and 99% of teams were going to fail because of lack of adoption.

But now that the low-hanging fruit has been plucked, and a new generation of hucksters has learned to take advantage of investors' lack of technical due diligence, they may need to start being more diligent.


People tend to believe what they do is the important part and everything else is fungible. From my observations VCs tend to think getting funded is what matters and doing the technical work is just a function of spending that money on people.


There is also a tendency I have found, in both VCs and tech founders, to assume that subject-matter experts can either be trivially replicated or hired downstream.


"Maybe Enki knew that also," Hiro says. "Maybe the nam-shub of Enki wasn't such a bad thing. Maybe Babel was the best thing that ever happened to us."


The lesson being, watch out for Hittites?



