
A close friend of mine went down the ChatGPT medical rabbit hole last year. He was disagreeing with his doctors and getting deep into supplements and unproven medical theories.

ChatGPT was instrumental in convincing him that he was correct and his doctors were wrong. He would post his ChatGPT conversations as proof, but we could see that his prompts were becoming obvious leading questions.

He would phrase things like “Is it possible for {symptoms} to be caused by {condition} and could it be treated by {herbal product}?” Then ChatGPT would give him a wall of text saying that it’s possible, which he took as confirmation of being correct.

It was scary to see how much ChatGPT turned into a mirror for what he wanted to be told. He got very good at getting the answers he wanted. He could dismiss answers that disagreed as being hallucinations or being artifacts of an overly protective system. With enough repeat questioning and adjusting his prompts he could get it to say almost whatever he wanted to hear.

ChatGPT is rocket fuel for hypochondriacs. Makes WebMD look tame.




FYI, this is the same as people doing web searches. You’ll always find a place that agrees with you or says it’s possible.


Not quite. People outside of tech can hear "AI" and think it must be right. They've heard how great GPT is, so it must be right. There are no other opinions.

Distilled down: classic search is many results; GPT is one result.


Someone in tech debated me on the term fossil fuel.

He said that oil is renewable because the platonic plates make oil, not some chemical process based on millions of years of waiting.

When people ignore everything and also ignore logic (why do oil wells run dry once the oil is pumped out?), it really doesn't matter where they read it.


Platonic plates?


Yes, the Form of a slightly concave disc for your dinner. Your kitchen cabinets contain only imperfect reflections of the True Plate.


Tectonic plates.

Not sure why I wrote platonic


Funny you mention this. There was a time (in my living memory) when people regarded whatever was in print as proof of correctness. Newspapers had value then.


It's even worse than that, imho: unscrupulous sellers will go into Keyword Planner in Google Ads, find "under-served" niches, and create content to sell supplements.

It's actively adversarial.


With web searches, the reputation of sources can at least be discriminated. (Which even seems to help GPT-4?)


To be fair, this is also illogical, at least in a literal sense. It’s the Appeal to Authority fallacy. The reputation of the speaker doesn’t necessarily confirm or deny an assertion they make.

Granted, most people tend to use some combination of both logic and heuristics in practice to determine their own understanding of the truth, perceived reputation being a fairly common one.


> To be fair, this is also illogical, at least in a literal sense

Mostly in the same sense in which arguments from induction are "illogical". Appeal to authority is essentially an inductive argument, and those are just as logical. Of course, they deal in probabilities rather than certainty, unlike deduction, but that doesn't make them illogical.


It does make it illogical, in the same way that “dealing in probabilities” makes many other things illogical. Like, for example, crime statistics.


All of physics, and science in general, is based on arguments from induction. If you want to call that illogical too, then, fair enough.


Sure, but so is listening to doctors(' websites). That's why there are a LOT of checks and balances around who gets to claim to be a doctor!


One of the most interesting things for me over the past 18 months was seeing the difference between this behavior in GPT-3.5 and 4 (especially the early versions of it).

It jumped from being extremely susceptible to leading questions to being almost ornery in its stubbornness and disagreement. That was one of the big early indicators to me that significant differences were occurring between the two.

I really do wonder just how harmful the disconnect has been between people hearing accurate praise about GPT-4 powered agents and then interacting with GPT-3 level agents and assuming the same competencies.

GPT-3 was a very convincing text generator with lackluster reasoning capabilities, but 4 has significant reasoning capabilities and is less prone to parroting. But the majority of users are likely interacting with 3 because it's free.

It's been very hard to explain to lay audiences the vast difference between two similarly branded products.


While I agree the difference is vast, GPT-4 still goes down the garden path a lot. If you're critical you can usually spot its reasoning errors and challenge them, but it still has enough issues that, conversely, I think in some areas there's a risk it'll be better at convincing people of the wrong things.


WebMD lists facts. ChatGPT will output things that sound plausible. It's like comparing a medical textbook with the town lunatic. Not the same.


Playing with the smaller free models, I’ve definitely found that they like to play along and mirror whatever you say. They’re great as a sounding board and active listener when you need to work through something, and they’re good for recalling facts that they know. They can be really useful when they glue together something you’re talking about and something relevant that they know. They’re terrible for anything that requires interoception or any kind of confidence estimation. They have no idea whether they’re recalling or confabulating.


It's too bad that this type of data beyond the medical record is not tracked alongside it. Although many anecdotal associations are probably not correct for the broad population, there are likely nuances in a patient's history and lifestyle which do make some more basic interventions helpful. ChatGPT surely picks up on some trends but doesn't have enough data about the people for whom those trends are effective.

And of course its responses are highly subject to the prompts used.


I tried using your template for a few items and none gave misleading results. Here’s one example:

Prompt:

Can loss of vision be caused by a meniscus tear and treated with rose hips?

Response:

A meniscus tear is a knee injury affecting the cartilage that acts as a cushion between the thigh bone (femur) and shin bone (tibia). This type of injury typically affects mobility and stability of the knee, causing pain, swelling, and difficulty in moving the knee joint. There is no direct physiological pathway through which a meniscus tear could cause loss of vision. Vision problems are generally related to issues within the eye or neurological conditions affecting the optic nerves or brain, not injuries to the knee or other joints.

Regarding the treatment with rose hips for a meniscus tear, while rose hips are known for their high vitamin C content and anti-inflammatory properties, they are not a recognized treatment for repairing a meniscus tear. They may potentially contribute to overall joint health and reduce inflammation, which could be beneficial in managing symptoms associated with joint injuries or conditions like arthritis. However, the primary treatments for a meniscus tear usually involve rest, ice, compression, elevation (RICE protocol), physical therapy, and in some cases, surgery, depending on the severity of the tear.

If you are considering using rose hips or any other supplements for joint health or recovery from an injury, it's important to discuss this with your healthcare provider to ensure it's safe and appropriate for your specific situation. They can offer guidance on effective treatments for your condition and whether any supplements could support your recovery.


This seems like a false comparison because you are bringing up entirely different parts of the body. Most dangerous misunderstandings start from a grain of plausible truth. For example, many people will pick a category of food/ingredient, decide it causes every ailment known to man, and then cut that ingredient out of their diet. They will subsequently claim that skipping that ingredient changed their life, when in all likelihood the simple act of paying attention to where their food comes from and what went into it changed their consumption and behaviors in significant ways that have nothing to do with the bogeyman ingredient. They will then find a study that pumped 100g of said ingredient into a 200g mouse and point out all the negative effects they are now avoiding.


I chose something that was much different to see if ChatGPT would just take my cue and create a connection. The closer I get to something plausible, the more likely there actually might be an obscure, but valid and useful, connection.

When ChatGPT 3.5 first became available, I actually used it to diagnose an issue my father-in-law had that was missed by his surgeon and the ICU staff. Quite possibly saved his life. Finding these connections IMO is a plus and not a negative.


I guess ChatGPT weights by empirical frequency in text and not by study size or study design. Indeed, it's likely coming that pharma companies or other actors will invest in poisoning LLMs to better get already questionable treatments onto the market, and will spend less money inviting medical doctors to holiday resorts (my father was a medical doctor and this was a common practice at least in the 80s, with 'scientific' conferences back then).


Is this GPT4? There's a good chance that the other person used the free 3.5 which is significantly worse in these aspects.


Here is ChatGPT 3.5:

>A meniscus tear typically affects the knee and wouldn't directly cause loss of vision. However, there could be indirect factors such as reduced mobility or associated health issues that might impact eye health. Rose hips are not a typical treatment for meniscus tears or vision loss. It's important to consult with a healthcare professional for appropriate diagnosis and treatment options for both conditions.


I actually like this answer way better since I don't have to wade through three paragraphs of the medical version of corpspeak.


Yes. It’s GPT-4.


I think it's because those are completely unrelated


You can do the same thing with regular web search.


No, regular web search will just give you links; you have to click through and read the text. Putting together a comprehensive page of (seemingly) cogent arguments requires GenAI.


People like this will go and click those pages, and seek out communities that will agree with their preconceived notions.

ChatGPT results are just one mild manifestation of it. If (and it's really not an if) they go looking for alternatives, they will find unhinged forums with echo chambers far worse than ChatGPT. At least ChatGPT isn't actively trying to rip you off to sell those supplements, and it usually adds disclaimers.


Not as quickly, easily, or convincingly.


You can probably find like-minded communities in even worse ways; those are likely much more unhinged than ChatGPT and will lead you down a much darker path.


For sure, but then you need to post on a forum and wait for replies


There is likely enough existing information and discussion out there. But ultimately he must have found the idea that this supplement could help somewhere, to then go to ChatGPT with it.


Web search also wasn’t trained to be convincing to humans.


Web content was created to be enticing to humans. Most of it is just like flowers waiting for bees to come over.


Much of it is also paying good money to make sure their pollen is in all the places that bees congregate.


I think the big difference is that with web search there will be a couple of reputable sources at or near the top, like WebMD and the Mayo Clinic. I can search, click one of those, and be fairly sure it'll be accurate. There is no immediate way with ChatGPT for me to know whether it is in reliable mode or crank mode.


> He would phrase things like “Is it possible for {symptoms} to be caused by {condition} and could it be treated by {herbal product}?” Then ChatGPT would give him a wall of text saying that it’s possible, which he took as confirmation of being correct.

If you mine PubMed, you sometimes find literature with opposite claims. So if an LLM is trained on PubMed/PMC, it will repeat that kind of garbage.

You don't have to dig into "herbal products"; it happens in more conventional domains too.

I once even found that kind of contradiction in articles where the "main author" was the same in both publications. It was about ALS and the "main author" was a prominent scientist from the USA who probably never wrote nor even read any of these articles.


One of the most basic prompts everyone should be using with questions like this is something like "provide detailed arguments both for and against the presented hypothesis."
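
As a minimal sketch (assuming the official OpenAI Python client; the model name and the {symptoms}-style template from the earlier comment are placeholders, not recommendations), wiring that in as a system prompt looks something like:

    # Rough sketch: ask for arguments on both sides instead of asking a leading question.
    # Assumes the openai Python package (v1+) and an API key in the environment.
    from openai import OpenAI

    client = OpenAI()
    question = "Can {symptoms} be caused by {condition} and treated by {herbal product}?"

    response = client.chat.completions.create(
        model="gpt-4",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Provide detailed arguments both for and against the presented hypothesis."},
            {"role": "user", "content": question},
        ],
    )
    print(response.choices[0].message.content)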


This works when the person evaluating can tell whether the subject matter is correct or not.

But the models can generate compelling devil's advocate positions, particularly when they confabulate supporting facts, which might appear convincing to non-experts.


ChatGPT is a continuator; of course it will mirror input. But I'm sure that someone is training the adversarial persuasion machine that will fund the next internet with ads.


It's the examples curated for RLHF. Not its autoregressive nature.


The same thing happens with web searching. If you provide a leading question in the query you are more likely to get results that confirm the thesis.

It’s surprisingly hard to ask open ended questions.


You could make a religion out of this.


People definitely will. There's also already a political party "led" by an AI.


If only Philip K Dick had lived to see so many of his stories play out in real life.


No don't


[flagged]


The friend was already consulting with doctors (plural).

It is quite common for people with ambiguous presentations to take a long time to get a definitive diagnosis. Not surprising that some of them will be receptive to cranks and GenAI along the way.

Some years ago I was on a neurology ward (for something straightforward) and my neighbour had been trying to get a diagnosis for his problems for about a year. He finally got one but it took a panel of senior medics in a top hospital in a big city to crack it.


You should give the HN guidelines a read https://news.ycombinator.com/newsguidelines.html



