Have you tested PaliGemma's OCR abilities? The article says it does well:

"In average accuracy, we saw 85.84%, beating all other OCR models except for Anthropic’s Claude 3 Opus."


It’s very good. And the cool thing is it’s also built for fine-tuning. Excited to see how fine-tuned OCR models do.

From TFA, PaliGemma is competitive with GPT-4o and even beats it in terms of speed and OCR accuracy. It can also do object detection (bounding boxes) and segmentation, which GPT-4V/o and Claude 3 Opus can't do at all.

Not to mention it's built to be fine-tuned and is commercially permissive!
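For anyone curious, detection is exposed through a plain text prompt. A minimal sketch with Hugging Face transformers (the checkpoint name, prompt format, and <loc> token layout are my reading of the model card, and the image path is a stand-in, so verify against the docs):

    import torch
    from PIL import Image
    from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

    model_id = "google/paligemma-3b-mix-224"  # "mix" checkpoints handle detect/segment prompts
    model = PaliGemmaForConditionalGeneration.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    processor = AutoProcessor.from_pretrained(model_id)

    image = Image.open("photo.jpg")  # stand-in path
    # "detect <object>" requests bounding boxes; "segment <object>" requests masks
    inputs = processor(text="detect cat", images=image, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=64)
    # The reply encodes each box as four <locYYYY> tokens (y1, x1, y2, x2, normalized to 1024 bins)
    print(processor.decode(out[0], skip_special_tokens=False))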


This informative take deserves to be higher up the comment chain!


Why can't we resolve this with synthetic data? Just take the original statements and ask another LLM to flip around the relation, then train on both. I tried this with GPT-4 and it seems to understand the task [0]:

Me:

For each of the following relational statements, flip the relation around and generate a new statement or question answer pair.

Input: Bluey's father is Bandit

Output: Bandit's daughter is Bluey

Input: Neil Armstrong was the first person to step on the Moon.

Output: Who was the first person to step on the Moon? Neil Armstrong.

Now for the real statements:

- Valentina Tereshkova was the first woman to travel to space

- Who is Mary Lee Pfeiffer's son? Tom Cruise

ChatGPT:

Here are the flipped relational statements:

Valentina Tereshkova was the first woman to travel to space

Output: Who was the first woman to travel to space? Valentina Tereshkova.

Who is Mary Lee Pfeiffer's son? Tom Cruise

Output: Tom Cruise's mother is Mary Lee Pfeiffer.

[0]: https://chat.openai.com/share/33eb3ee4-2094-4748-b01e-0967af...
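Scaled up, that prompt is just a second pass over the training corpus. A minimal sketch with the openai Python client (the prompt wording and model name are placeholders, not a tested recipe):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    FLIP_PROMPT = (
        "Flip the relation in the following statement and output a new "
        "statement or question-answer pair. Output only the flipped text.\n\n"
    )

    def flip(statement: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4",  # placeholder model name
            messages=[{"role": "user", "content": FLIP_PROMPT + statement}],
        )
        return resp.choices[0].message.content.strip()

    corpus = ["Valentina Tereshkova was the first woman to travel to space"]
    # Train on both the original and the flipped direction of each relation
    augmented = corpus + [flip(s) for s in corpus]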


The difficulty would likely be that the “X is Y” relations don’t just occur in that simple, separated-out form, but in a whole spectrum of more complicated contexts, and the latter probably dominate. You would have to identify those and synthesize a text corpus that contains the reverse notions across a similar spectrum.


I don't think the reversal curse actually replicates, though. When I looked into it, it seemed like there were some flaws in the study, and when I fixed those I couldn't get the same results they did anymore. I keep meaning to write this up properly but never get around to it.

If you think about it, they're making an extraordinary claim given how fluently LLMs engage in language and reasoning.


great insight!


Here's how I usually describe what a migraine aura looks like:

It's just like the after-image you see after glancing directly at a bright light (like a light bulb's filament), then looking away or closing your eyes. Now imagine that after-image with a colorful scintillation along the edges, sometimes wrapping around a blind spot in your vision (not necessarily a black spot).

The cilia of a lobed comb jelly are close to what I'm talking about (video: https://youtu.be/LPu1juezWeg)

Auras are beautiful little harbingers of debilitation.


One other thing to note: although auras move and change over time, unless they appear in the center of your vision (fovea), you won't be able to directly "look" at one.


Right -- they happen in "screen space", not "world space", to use a graphics analogy.
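To make the analogy concrete, a toy sketch (nothing to do with actual neurology): a world-space feature shifts across your view as your gaze turns, while a screen-space one is pinned to the same spot in the visual field:

    # Toy model: gaze direction as a horizontal angle in degrees
    VIEW_CENTER = 0.0

    def world_space_position(marker_angle, gaze_angle):
        # A lamp on the wall: moves across your view as your gaze turns
        return marker_angle - gaze_angle

    def screen_space_position(offset):
        # An aura: fixed offset in the visual field, independent of gaze
        return VIEW_CENTER + offset

    for gaze in (0.0, 15.0, 30.0):
        print(gaze, world_space_position(45.0, gaze), screen_space_position(10.0))

The lamp's apparent position changes with every glance; the aura's never does, which is why you can't chase it with your eyes.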


One of the APIs powering the Pinball Map is the excellent Open Pinball Database (opdb.org).

OPDB came about because an older site, IPDB, doesn't have an API (and doesn't want one), which various pinball software projects needed in order to communicate with each other. I'm in some pinball leagues with the developer, Andreas Haugstrup, who also makes matchplay.events, pintips.net, and other sites that serve the pinball community.
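For a flavor of what having an API buys you, a hypothetical search call (the endpoint path, parameter names, and response fields here are my recollection of OPDB's docs, not verified; you need a free API token from the site):

    import requests

    # Hypothetical sketch -- confirm endpoint/field names against opdb.org's API docs
    resp = requests.get(
        "https://opdb.org/api/search",
        params={"q": "Medieval Madness", "api_token": "YOUR_TOKEN"},
    )
    for machine in resp.json():
        print(machine.get("name"), machine.get("opdb_id"))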


IPDB is run by an odd dude who refuses to modernize. God bless him, tho, his internet 1.0 website is invaluable.


Requisite link to SNL's 1988 skit "All Drug Olympics"

https://youtu.be/jAdG-iTilWU


Came here to also praise the 2400c. I finally restored mine to working condition just last week after finding a donor screen from a parts machine.

Excellent build quality thanks to IBM Japan, which thankfully incorporated the inverted-T arrow keys, a first for an Apple notebook that persists in today's MacBooks. That build quality also means the 2400c doesn't suffer from the poor hinges of the PowerBook 500/5300/1400c models.

The processor is on a daughterboard, allowing the 2400c to be upgraded to a G3, and the modding scene out of Japan has brought a ton of interesting upgrades like translucent cases and keyboards.

The main problems the 2400c suffers from are leaky PRAM batteries and other issues causing the dreaded Green Light of Death (GLoD), where the machine won't boot without hardware replacements like a new processor card.


+1 for the 2400c. Picked one up when I moved to SF in early 2020 and restored it (SSD + more RAM).

The one complaint is that the keyboard (both US and Japanese) is just a tad small for comfort. The Wallstreet/Pismo keyboards were much better and a favourite of mine.


Do you have yours upgraded to the Japanese G3 240MHz? (Or any G3?) Curious about how it performs.

Have you run BSD or anything on it, or just classic Mac OS?


This reminds me of a scene from the Simpsons [1]:

Homer: Hmm. I wonder why he's so eager to go to the garage?

Moe: The "garage"? Hey fellas, the "garage"! Well, ooh la di da, Mr. French Man.

Homer: Well what do you call it?

Moe: A car hole!

[1] https://www.youtube.com/watch?v=m5t9w98afJo

