Hacker News | epigramx's comments

Peer review success is not determined by the owner of a company but by the acceptance you get from your peers.


OCR is not AI


    AI is whatever hasn't been done yet.
        — Larry Tesler, 1970
Source: https://en.wikipedia.org/wiki/AI_effect


Yes, but they're quite good at it. Reliable OCR is font-dependent, whereas I think a lot of models just kind of figure it out regardless.


One reason I don't quite trust AI for OCR is that it will, on occasion, hallucinate the output.


All OCR is untrustworthy. But sometimes, OCR is useful. (And I've heard it said that all LLM output is a hallucination; the good outputs are just hallucinations that fit.)

A few months ago a warehouse manager sent us a list of serial numbers and the model numbers of some gear they were using -- with both fields being alphanumeric.

This list was hand-written on notebook paper, in pencil. It was photographed with a digital camera under bad lighting, and that photograph was then emailed.

The writing was barely legible. It was hard to parse. It was awful. It made my boss's brain hurt trying to work with it, and then he gave it to me and it made my brain hurt too.

If I had to read this person's writing every day I would have gotten used to it eventually, but in all likelihood I'll never read something this person has written ever again. I didn't want to train myself for that, and I didn't have enough of a sample set to train with, anyway.

And if it were part of a high-school assignment it would have been sent back with a note at the top that said "Unreadable -- try again."

But it wasn't a high school student, and I wasn't their teacher. They were a paying customer and this list was worth real money to us.

I shoved it into ChatGPT and it produced output that was neatly formatted into a table just as I specified with my minimal instruction ("Read this. Make a table.").

The quality was sufficient to allow us to fairly quickly compare the original scribbles to the OCR output, make some manual corrections that we humans knew how to do (like "6" was sometimes confused with "G"), and get a result that worked for what we needed to accomplish without additional pain.

0/10. I'm glad it worked and I hope I never have to do that again, but will repeat if I must.
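That manual-correction pass (double-checking glyphs OCR commonly confuses, like 6/G) can be sketched in a few lines. The confusable pairs and serial-number format below are illustrative, not from the actual job:

```python
# Hypothetical post-OCR check: flag positions holding characters that OCR
# commonly confuses (e.g. 6/G, 0/O, 1/I) so a human verifies only those.
CONFUSABLE = {"6": "G", "G": "6", "0": "O", "O": "0", "1": "I", "I": "1"}

def flag_suspects(text):
    """Return (index, char, plausible_alternative) for each confusable char."""
    return [(i, c, CONFUSABLE[c]) for i, c in enumerate(text) if c in CONFUSABLE]

print(flag_suspects("SN-6O1G4"))
# → [(3, '6', 'G'), (4, 'O', '0'), (5, '1', 'I'), (6, 'G', '6')]
```

A reviewer then only eyeballs the flagged positions against the original scan instead of re-reading every character.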


There was a good talk some years ago at some of the CCC events where some guy found out that scanners sometimes change numbers on forms.


It's David Kriesel's infamous talk about the even more infamous Xerox bug.

Talk: https://media.ccc.de/v/31c3_-_6558_-_de_-_saal_g_-_201412282...

Bug: https://en.wikipedia.org/wiki/Xerox#Character_substitution_b...
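The mechanism behind that bug (JBIG2 compression substituting each scanned glyph with the closest glyph already in its pattern dictionary) can be shown with a toy model. The bit patterns here are invented for illustration, not real glyph data:

```python
# Toy model of the JBIG2 failure mode: lossy compression that replaces
# each scanned glyph with the nearest glyph stored in its dictionary.
# Bit patterns are made up purely for illustration.
GLYPHS = {"6": 0b111011, "8": 0b111111}

def nearest_glyph(scanned_bits, dictionary):
    """Pick the dictionary character whose pattern differs in the fewest bits."""
    def distance(a, b):
        return bin(a ^ b).count("1")
    return min(dictionary, key=lambda ch: distance(dictionary[ch], scanned_bits))

# A smudged "6" (one stray bit set) silently becomes an "8".
print(nearest_glyph(0b111111, GLYPHS))
# → 8
```

The output still looks crisp, which is exactly why nobody noticed for years: the substituted glyph is a perfectly clean, perfectly wrong character.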


But AI can OCR


They do so by running the image through an OCR tool call


They can, sure... that's really just LLMs, though.

ML models that recognize handwriting existed long before LLMs could call tools, though

Identifying digits is like the "Hello World!" of ML

https://www.youtube.com/watch?v=aircAruvnKk
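The classic exercise that video covers (MNIST digit classification) reduces, in its simplest form, to nearest-neighbour matching. Here is a toy sketch with hand-made 3x5 bitmaps standing in for real digit images:

```python
# Minimal "Hello World" of ML: 1-nearest-neighbour matching of 3x5 digit
# bitmaps (a toy stand-in for MNIST; these bitmaps are hand-made).
# Each string is 15 pixels, row-major, "1" = ink.
TEMPLATES = {
    "0": "111101101101111",
    "1": "010010010010010",
    "7": "111001001001001",
}

def classify(bitmap):
    """Return the digit whose template differs in the fewest pixels."""
    def distance(a, b):
        return sum(x != y for x, y in zip(a, b))
    return min(TEMPLATES, key=lambda d: distance(TEMPLATES[d], bitmap))

# A noisy "1" with one flipped pixel still matches "1".
print(classify("010010110010010"))
# → 1
```

Real systems swap the pixel-difference metric and hand-drawn templates for learned features, but the "find the closest known example" intuition is the same.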


An OCR tool is ML. AI is generally used to mean LLMs. You're repeating what I already wrote.


No they don't, they natively "see" images.


That's a thing I always marvel at - how LLMs are so versatile and do so much stuff so well that was out of reach just a few years ago


Especially when you consider how expensive "good" OCR software is


On apple platforms it definitely is an AI. Apple intelligence!


AI says that OCR is AI.


God of the gaps


Not a big deal, because they tend to be trusted eventually by the search engines and the language models, though I don't trust the latter much, tbh.


The US is the same country that allows a practical monopoly of NVIDIA on GPUs and Intel on CPUs (or at least an oligopoly), and then pretends "foreigners are out to get us". It takes one to know one.


This is a US policy for US citizens. Of course they are protecting their own. China does the same. No Facebook or Google allowed in China.


China is a totalitarian dictatorship. The US is the land of the free.


Not the land of absolute free speech. Try owning a newspaper or TV station as a non-US citizen


On the other hand, I find this a bit concerning too? The USA is starting to look a bit more like China. There is now only "one world view" for us. Given the friendships between the people who run X and Meta, it might leave us in a precarious situation.

Banning TikTok only treats the symptom; the real disease is that people are way too susceptible to propaganda and misinformation.


Did you miss the part where China is a foreign adversary? They don't play nice. If you try to play nice with someone who wants to kill you then you get killed.


But I’m not saying China are the nice guys, I’m saying we’re now left with the same thing, just run by the US government. You might think that’s a good thing. I don’t.

Personally I think all closed source social media should be outlawed and all algorithms used should be audited by a third party.


End users just won't care about the algorithm. Try talking to a niece or nephew, especially one who makes money on the platform, about The Algorithm and you'll get blank stares, or, at best, a "yeah, I know, but...".

If you've had better luck, let me know (actually).

As for "being China", every country has protections on what goes in or out of the country including media. A lot of countries won't let you own a newspaper or news broadcast channel, so this is the next extension of that sort of idea.

It's the same idea as not allowing a company from the USSR to run a news channel during the Cold War, although obviously the lines are fuzzier and still being discovered with apps and algorithms.


We have nukes. If they try to kill us, everyone dies. "Foreign adversary" just means they're big enough to get a seat at the table in a multipolar world.


Dominant market leaders aren't inherently bad for the world. That's why anti-trust laws are narrow. Only when they are so ingrained and conspire to be anti-competitive (usually via lobbying for government policy that creates barriers to entry) do they harm the ability of competitors to replace them. NVIDIA constantly and perpetually has companies at its throat looking to take its market, which means it had better deliver to customers.



I reached the end.

Not sure that will help me end my stock market addiction. Nice game.


It will happen, as long as people don't get that language models are a glorified google search. They predict what they should say based on what they read.

If what they read is nonsense, then they will predict nonsense.

They are basically glorified parrots.
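The "glorified parrot" point is easy to demo with the simplest possible language model: a bigram table that can only ever echo word pairs it has read. This is a toy sketch, nothing like a real transformer, but the garbage-in, garbage-out behaviour is the same:

```python
import random
from collections import defaultdict

# Toy next-word predictor: a bigram model "trained" on whatever text it
# reads. Feed it nonsense and it will faithfully parrot nonsense back.
def train(text):
    model = defaultdict(list)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        model[prev].append(nxt)
    return model

def generate(model, start, n=5, seed=0):
    random.seed(seed)
    out = [start]
    for _ in range(n):
        options = model.get(out[-1])
        if not options:
            break
        out.append(random.choice(options))
    return " ".join(out)

model = train("the moon is made of cheese and the moon is hollow")
print(generate(model, "the"))
```

Whatever comes out, it can only be a remix of the nonsense it was fed; the model has no way to know that the moon is not made of cheese.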


> They predict what they should say based on what they read.

There's so much anthropomorphization in the air in these debates, that I worry even this statement might get misinterpreted.

The text generator has no ego, no goals, and is not doing a self-insert character. The generator extends text documents based on what it has been initialized with from other text documents.

It just happens to be that we humans have purposely set up a situation where the document looks like one where someone's talking with a computer, and the text it inserts fits that kind of document.
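That setup can be sketched in a few lines, assuming a made-up transcript format (real chat templates differ per vendor and are more elaborate):

```python
# Sketch of the point above: a "chat" is just one flat text document
# that the model is asked to continue. The "Role: text" template here
# is illustrative, not any vendor's actual format.
def render_transcript(messages):
    lines = [f"{role}: {text}" for role, text in messages]
    lines.append("Assistant:")  # the generator merely extends from here
    return "\n".join(lines)

doc = render_transcript([("User", "What is OCR?")])
print(doc)
```

The model never "sees" a conversation, only this one document, and the inserted text looks like a reply because replies are what fit that kind of document.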


Yep. As I said in another post, they’re human simulators. It looks and sounds enough like a human that it’s tricking people into believing in this illusion of intelligence or intent. I have to imagine the very smart people at OpenAI and Anthropic understand this, and I think a lot of these reports about apparent sentience are being released to push the hype wave and generate investment before the truth becomes apparent to everyone.


I bet it still thinks 1+1=3 if it read enough sources parroting that.


And the clock is ticking on the catch-up of logic: how fast will people catch on that we don't have Skynet within 2 years, but a glorified google search for the next 20 years?


Am I the only one who's getting annoyed at seeing LLMs marketed as competent search engines? That's not what they were designed for, and they have been repeatedly bad at it.


Yeah, they're totally not designed for that. I'm also surprised that companies that surely know better market them as such.

Combined with a search engine and AI summarisation, sure, that works well. But barebones, no. You can never be sure whether it's hallucinating or not.


If everything you do starts by asking an LLM, then you start with superficial research, because frankly it was never anything better than a fancy google search.

