I agree. Their assumption must be that OCR fails when part of a letter is occluded, and that to keep the text readable for humans they can supply the missing information in other frames. But the letters have very clean borders and shadows, they are not deformed, and they move rigidly, which makes them trivial to track. Even a single frame provides more information than a traditional captcha. Still, the concept is promising and would work with a few modifications (e.g. no stroke color, time-dependent deformations, not showing all letters in every frame, etc.).
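To illustrate why rigid motion is a weakness, here is a minimal sketch of how an attacker might combine frames. It assumes the per-frame offsets of a letter have already been recovered by tracking; the toy glyph, the occlusion sets, and the function names `render` and `merge_frames` are all hypothetical and are not taken from the captcha being discussed.

```python
# Toy 3x5 glyph (the letter "F"); 1 = glyph pixel, 0 = background.
GLYPH = [
    [1, 1, 1],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 0, 0],
]


def render(glyph, offset, occluded, height=8, width=8):
    """Draw the glyph at a rigid (dx, dy) offset, hiding the occluded pixels."""
    dx, dy = offset
    frame = [[0] * width for _ in range(height)]
    for y, row in enumerate(glyph):
        for x, pixel in enumerate(row):
            if pixel and (x, y) not in occluded:
                frame[y + dy][x + dx] = 1
    return frame


def merge_frames(frames, offsets, glyph_height, glyph_width):
    """Undo each frame's offset and OR the visible pixels together."""
    merged = [[0] * glyph_width for _ in range(glyph_height)]
    for frame, (dx, dy) in zip(frames, offsets):
        for y in range(glyph_height):
            for x in range(glyph_width):
                merged[y][x] |= frame[y + dy][x + dx]
    return merged


# Assumed tracking result: one (dx, dy) offset per frame.
offsets = [(0, 0), (2, 1), (4, 3)]
# Each frame hides a different part of the glyph (glyph coordinates).
occlusions = [{(0, 0), (1, 0)}, {(0, 2), (1, 2)}, {(0, 4), (2, 0)}]

frames = [render(GLYPH, off, occ) for off, occ in zip(offsets, occlusions)]
recovered = merge_frames(frames, offsets, len(GLYPH), len(GLYPH[0]))
```

No single frame contains the whole glyph, but once the frames are aligned the merged result does. Time-dependent deformations would break the simple alignment step this sketch relies on.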