I am guessing you are not familiar with the AI-powered vision features that have already been shipping for a few years. They are mostly accessibility-related, so I am not surprised you missed them.
Yep. Google, the AI company, only recently launched image descriptions in TalkBack, which VoiceOver has had for years now. Google still doesn't have Screen Recognition, which basically does OCR and image/UI classification to make inaccessible apps more accessible.
Don't even get me started on TalkBack and Android. It was never on par with VoiceOver, and it is still a few years behind... However, VoiceOver is also slowly but surely getting worse over time when it comes to small, subtle bugs...