I needed to pull serial numbers off Apple product boxes from pictures taken by a clueless person on their phone. Every OCR tool I tried failed.
A vision model did the trick so well that there's not much left to discuss. The prompt I used:
"This is a picture of Apple product box. Find and return only the serial number of the product as found on a label. Return 'none' if no serial number can be found."