Mapping screenshots to code is not hard in itself. A model that simply memorizes the screenshot-to-code mappings in its training data can reach almost 100% accuracy (on some demo). What is hard is generalizing to a new screenshot the model has never seen. Getting something to work for mobile UIs is also a much easier task than getting it to work for other, more complex UIs. Looking forward to seeing more updates on this!
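
To make the memorization point concrete, here is a minimal sketch, not anything from the project being discussed. The screenshot IDs, the code strings, and the `MemorizingModel` class are all hypothetical stand-ins: a lookup table scores perfectly on the pairs it was trained on and fails on anything held out.

```python
# Hypothetical toy data: string IDs stand in for screenshots,
# short markup strings stand in for the generated code.
train = {"shot_a": "<Button/>", "shot_b": "<List/>"}
held_out = {"shot_c": "<Table/>"}


class MemorizingModel:
    """A 'model' that only stores the training pairs in a lookup table."""

    def __init__(self):
        self.table = {}

    def fit(self, pairs):
        self.table = dict(pairs)

    def predict(self, screenshot):
        # Unseen screenshots get an empty prediction.
        return self.table.get(screenshot, "")


def accuracy(model, pairs):
    """Fraction of screenshots whose predicted code matches exactly."""
    correct = sum(model.predict(shot) == code for shot, code in pairs.items())
    return correct / len(pairs)


model = MemorizingModel()
model.fit(train)
train_acc = accuracy(model, train)
held_out_acc = accuracy(model, held_out)
print(f"train accuracy: {train_acc}, held-out accuracy: {held_out_acc}")
```

So the number that matters is the accuracy on screenshots kept out of training, not the accuracy on the demo set.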

Yep, for example, more dynamic UI such as tables, lists of components (for example, kanban swim lanes), etc.

