It was painful, but I relish the idea of walking someone through a GUI to do it even less.
I eventually got through it by getting them on IM and giving them text to copy-paste, and sometimes they'd send me screenshots.
Your user's mental model is that they are looking at a black and white screen written in ancient Aramaic. They can't read it, they won't try to read it, and they cannot keep it in their working set while speaking with you or while engaged in other tasks such as, most relevantly, typing. They hope the demon on the phone will tell them the right magic spell, because this is so frustrating.
But irrespective of their inability to read ancient Aramaic, they can still identify color, location, and motion. So the clever demon will always phrase his requests to read ancient Aramaic in terms of color, location, and motion.