Maybe you should fact check your AI outputs more if you think it only hallucinat...

simianwords · 2025-08-16T16:46:13 1755362773

The accuracy is high enough that I don't have to fact check too often.

platevoltage · 2025-08-16T23:12:42 1755385962

I totally get that you meant this in a nuanced way, but at face value it sort of reads like...

Joe Rogan has high enough accuracy that I don't have to fact check too often. Newsmax has high enough accuracy that I don't have to fact check too often, etc.

If you accept the output as accurate, why would fact checking even cross your mind?

gspetr · 2025-08-16T23:35:29 1755387329

Not a fan of that analogy.

No reasonable observer expects a podcast host to be an expert on a very broad range of topics, from science to business to art.

But there is such an expectation of LLMs, if only because AI companies diligently post benchmarks, including trivia, on those topics.

simianwords · 2025-08-17T04:44:23 1755405863

Do you question everything your dad says?

platevoltage · 2025-08-17T05:49:27 1755409767

If it's about classic American cars, no. Anything else, usually.

collingreen · 2025-08-16T17:23:52 1755365032

Without some exploratory fact checking, how do you estimate how high the accuracy is, and how often you should be fact checking to maintain a good understanding?

simianwords · 2025-08-16T20:07:45 1755374865

I did initial tests so that I don't have to do it anymore.

jibal · 2025-08-17T02:33:02 1755397982

Everyone else has done tests that indicate that you do.

glenstein · 2025-08-17T10:51:28 1755427888

And this is why you can't use personal anecdotes to settle questions of software performance.

Comment sections are never good at being accountable for how vibes-driven they are when selecting which anecdotes to prefer.

malfist · 2025-08-16T20:22:50 1755375770

If there's one thing that's constant it's that these systems change.

mvdtnz · 2025-08-17T00:07:52 1755389272

If you're not fact checking it, how could you possibly know that?