
Pix2code: Generating Code from a GUI Screenshot - visarga
https://github.com/tonybeltramelli/pix2code
======
Black-Plaid
Oh, man, _if only_ the designs I got were this simplistic and only used
standard controls with the standard look and feel.

This is a very interesting start, but it's a long long way from being able to
even represent in a simple layout what I am asked to do on a regular basis as
an iOS dev.

Maybe combine this with a PaintCode back-end and add markup for layout
behaviors?

~~~
melloclello
"Design hubris" is such a common problem for mobile/front end developers and
the source of so much wasted time I'm surprised it's not talked about more
commonly. Shit's expensive, man!

I am thankful that these days I get to work with talented designers who know
what a UITableView is and understand the limitations of the medium.

~~~
realharo
It's also funny when the design has some short placeholder text somewhere.
Then the actual real text ends up being longer and won't fit there, or will
look like crap.

~~~
tal_berzniz
[http://berzniz.com/post/14669116083/dont-be-a-liar-
and-4-oth...](http://berzniz.com/post/14669116083/dont-be-a-liar-and-4-other-
tips-for-ios-graphic)

------
codingdave
I wrote a system with this concept back in the early 00s, to let designers
drag/drop a design on-screen and then read the positioning of the dropped
elements to compute the HTML that would produce their design... it was a
dismal failure. What I found was that designers who wanted to work with the
web didn't want an easy-to-use tool. They either wanted to dig in and learn
HTML themselves, or just keep using Photoshop. And end users building personal
sites didn't want to innovate on their designs, they wanted to use pre-made
templates and just fill in their own text, change colors and images, etc.

Maybe the pervasiveness of computing has increased to where there now would be
a market for such things... it has been 15 years since I failed to find an
audience after all. And I wouldn't hold back on developing the technology, but
I'd put a decent amount of effort into a product fit before getting too
excited.

------
minademian
As an engineer, I'd be interested in the quality of the code generated by this
tool. I have too many nightmares from the early days of web dev where
generated code was a mess because FrontPage.

~~~
jjcm
It can be done, but it's rare. I actually really enjoyed working with
Expression Blend (Microsoft's front end xaml wysiwyg design program). The code
that it outputted was usable, which after seeing the things that dreamweaver
made was a breath of fresh air.

~~~
minademian
Agreed. Dreamweaver and FrontPage committed acts of mass monstrosity with the
code generated.

------
eriknstr
I submitted this to some subreddits and a couple of people asked some quite
similar questions that I'd like to see a video for, in case OP here is the
author or in case the author sees this.

>What would happen if I put something other than a clear mockup as input?
Replications?

>What happens if you try to scan console output [1] or this mess [2]?

1: [http://i.imgur.com/LJqJUnF.png](http://i.imgur.com/LJqJUnF.png)

2: [http://i.imgur.com/OMbZsgd.png](http://i.imgur.com/OMbZsgd.png)

------
radarsat1
Side rant: the page has a bibtex entry for arxiv. Look, it's all well and good
to publish early and often, but I find I still have some reservations.. I
mean, if I use this as a basis for some future work, obviously I will cite it,
but I don't particularly feel good about citing an arxiv publication. It
completely side-steps the peer review process, and I feel that in the long
run, that is bad.

You might say, well working code is working code, and sure I've cited
_software_ in the past, and having an article to go with it is even better,
but it's getting to the point that people are using arxiv not as a preprint
service but as a publishing platform. I find this frustrating for two reasons:
1) like I said, it skips peer review and even allows people to cite rejected
papers (for better or for worse), and 2) it makes the race to publish that
much more severe -- now if I wait until a conference or journal publishes my
work, I'm 6 months behind the guy who just uploads it to arxiv and is already
getting dozens of citations in current work.

So, perhaps this is not the place for this debate, but putting aside the fact
that preprint does seem like a useful way to "pre-publish", do you think it's
appropriate to _cite_ preprint papers and work? What are the implications for
computer science as a research field down the road, since "free for all" seems
to be taking over as a publication medium?

I know this will come off as being old fashioned, but I'm really worried about
where research publication is going in this field. It feels like a knee-jerk
reaction to first-to-publish pressure, rather than something that is a well
thought-out solution.

~~~
pcwalton
In reality, the competition isn't between "posting to arXiv" and "getting
peer-reviewed". The competition is between "posting to arXiv" and "posting to
your blog". If you succeed in manufacturing a stigma against arXiv papers, then
you're just going to encourage people to make blog posts instead, which can
and do go down at any time.

~~~
radarsat1
> If you succeed in manufacturing a stigma against arXiv papers

That's not exactly my intention, and your statement goes in line with what I
said..

> sure I've cited software in the past, and having an article to go with it is
> even better

where "software" could also include blog posts. It's just that, yeah, when it
comes to it if I have to choose between citing something some guy wrote on a
blog or citing Arxiv, I'll choose the latter every time.

But my question was not about whether arxiv is an appropriate medium for
publishing ideas, I'm certainly not arguing that it should go away, but rather
whether it's an appropriate thing to be citing in a scientific context. i.e.
in derived work.. should it be "okay" for people to publish to arxiv and
just.. leave it there and not put it in a conference or journal? Should such
work be validated by citation?

It's a legitimate question that I don't know the answer to. If you get an idea
from there, you can't just.. not cite it.. and yet, I feel like arxiv should
be used only as long as the work will eventually get a proper publication.
Which more and more it seems is not a given.

I'm really just responding to the statement on the posted link, which is just
a header, "Citation", and an Arxiv-bibtex entry. No "submitted to X..", "in
press", etc., or anything.

Again, it's not that I'm totally against the idea, but I think there must be
some happy medium between "peer reviewed work" and "well-written but
unreviewed article". Typically this used to be conferences, but conferences
are expensive, and with people uploading their pre-conference unaccepted
publications and _those_ getting citations, I mean.. where will this end?

I was a bit shocked recently while writing an ML article to find that the
"official" original reference, that you see cited everywhere, for the whole
concept of "style transfer" appears to be an arxiv paper [1]. My reservation
doesn't so much come from the fact that the authors put their work there, but
more that this is what people are actively choosing to cite over their peer-
reviewed conference publication [2]. When does "pre-print" become just
"print"?

(Currently Google Scholar reports 216 citations for the former and 74
citations for the latter. Of course they were published a few months apart so
it's not a great comparison but still, just an example..)

[1]: [https://arxiv.org/abs/1508.06576](https://arxiv.org/abs/1508.06576)

[2]: [http://www.cv-foundation.org/openaccess/content_cvpr_2016/ht...](http://www.cv-foundation.org/openaccess/content_cvpr_2016/html/Gatys_Image_Style_Transfer_CVPR_2016_paper.html)

~~~
Houshalter
I think the reason for this might be because a lot of machine learning
research and practice takes place outside of academia. Maybe there isn't the
same pressure to publish in official sounding places.

And people outside of academia often don't have access to journal
publications. Who wants to spend $30 per paper? That's just obscene. The arxiv
link is accessible everywhere, will never go away (probably), and can be
updated as the authors revise the paper. Why not cite it?

------
toisanji
I wrote a similar system and plan to release the source for mine over the next
few weeks. It produces working code that can be run on Linux, Mac, and Windows.

------
hardwaresofton
Did we collectively give up on building UI building/organization tools that
are easy enough for designers to use?

I understand that you can't control what tools designers use (whether sketch
or photoshop or MS paint), but it seems like building a tool they don't hate
using that builds UIs is the way simpler solution? There are already mockup
apps that include even basic functionality that designers use...

~~~
miguelrochefort
Microsoft Expression Blend is very good.

------
idibidiartists
It's best to make it so easy for a designer to make responsive (and even
adaptive) layouts (for both native mobile and web) that this AI-based
contraption that is guaranteed to produce wrong outcomes some of the time
becomes unnecessary, at least for any practical purpose (unless the goal is
the automated copying of web pages which is a lot bigger problem than guessing
the markup, given how much complex dynamic behavior is built into pages these
days)

I've taken an attempt at simplifying the task of building responsive and
adaptive React Native apps with the following little library.

[https://github.com/idibidiart/react-native-responsive-
grid](https://github.com/idibidiart/react-native-responsive-grid)

------
jaytaylor
Looking forward to when the code actually becomes available.. until then, it
is a cool demo video :)

------
ge96
Awesome. Before I got into web design and became a "full stack developer" I
wanted this same thing... asked about it on a forum. I'll have to read the
article/check it out to see if you literally do something like pixel mapping
or "Open CV" throw that in there to be safe. "Bit mapping?"

edit: I didn't ask if this was for web or app, it looked like it was for
applications... wouldn't know how to do layout on that but can do it with
HTML/CSS, maybe with Electron you could do that for Desktop apps, not sure
about Android/iOS though.

~~~
sebastian
The demo video shows two examples, an iOS UI and a web UI. It looks like it's
using Bootstrap for the web UIs.

~~~
ge96
Ahh thanks for the clarification, I probably should have watched it full
screen.

I suppose once you know how layout works with XML (Android) it wouldn't be
hard to translate... assuming you've got the layout down... interesting to see
their decisions on whether things are grouped together or work
separately/positioned independently like a menu icon. Unless it's absolutely
positioned and not responsive/dynamic.

In any case, thanks.

------
doodpants
Neat concept, but it wouldn't work for creating modern GUIs, where half the
controls are invisible until moused over (desktop), or hidden behind hamburger
menus and swipe gestures (mobile).

~~~
whatever_dude
Not only that, all those magical solutions are doing is solving the easy part
of development, which is laying down the widget-based layout. The actual hard
parts (different behavior based on context such as viewport size/orientation,
actual UI business logic) are an afterthought.

Impressive work, yet misguided in many ways in my opinion.

------
scraft
Interesting, wonder how it handles much more complex setups. I find a big part
of taking a design from Photoshop and getting it running on iOS/Android is
often more about thinking about what the constraints are, i.e. does an element
sit at an absolute distance from a screen edge, does it sit a specific
distance from another element, does it expand to fit an area, you obviously
have to do this to ensure it works at different screen resolutions. These
would certainly be hard decisions for AI to get right on a relatively complex
screen, but then again, maybe with enough training it could actually do really
well and solve these problems in a completely different way to how a human
would. There is also stuff like considering whether any of the information is
dynamic (text being received from a server), in which case elements need to be
able to adapt to different sizes etc. Again, that is hard for AI to have any
clue about from a Photoshop image only.
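
The constraint kinds described above (a fixed distance from a screen edge, a fixed gap to another element, stretching to fill an area) can be sketched as a tiny one-dimensional solver. This is only an illustration under assumed numbers; the function name `layout_row` and its parameters are hypothetical and are not something pix2code is known to produce.

```python
def layout_row(screen_width, margin, gap, button_width):
    """Place a stretchy text field and a fixed-width button on one row.

    Hypothetical constraints:
      - the field is pinned `margin` points from the left screen edge
      - the button is pinned `margin` points from the right screen edge
      - the field stretches so that it ends `gap` points before the button
    Returns (x, width) pairs for the field and the button.
    """
    button_x = screen_width - margin - button_width
    field_x = margin
    field_width = button_x - gap - field_x
    if field_width <= 0:
        raise ValueError("screen too narrow for these constraints")
    return (field_x, field_width), (button_x, button_width)


# The same constraints produce different frames on different screen widths.
for width in (320, 768):
    print(width, layout_row(width, margin=16, gap=8, button_width=80))
```

A single screenshot only shows one solved frame per element, so several different constraint sets could explain it equally well, which is the ambiguity the comment points at.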

------
seibelj
I think this is just creating the layout files, all the work of wiring it up
still has to be done by the programmer. This is really cool and probably a
glimpse of the future, but honestly it would probably be easier to train the
designer on how to use the drag and drop UI builders that already come with
these platforms.

~~~
hasenj
Why train someone to do grunt work when the machine can do it?

Translating picture designs into html files is not really that much fun.

Now the next step (probably more difficult) is you need a way to let the
programmers "hook" into the generated output in order to further tweak it or
customize it, but _without_ having to modify the generated files.

_That_ would make this technology really viable.
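
One common way to get hooks like this is to keep generated code in a base class and put hand-written changes in a subclass that lives in a separate file. The sketch below assumes that approach; the class and method names are hypothetical, and nothing here reflects how pix2code actually structures its output.

```python
class GeneratedLoginScreen:
    """Stands in for a machine-generated file that is overwritten on each run."""

    def render(self):
        return "\n".join([
            "<form>",
            '  <input name="user">',
            f'  <button onclick="{self.on_submit_handler()}">Log in</button>',
            "</form>",
        ])

    def on_submit_handler(self):
        # Default hook: the generated code has no real behavior to attach.
        return "noop()"


class LoginScreen(GeneratedLoginScreen):
    """Hand-written file: customizations live here and survive regeneration."""

    def on_submit_handler(self):
        return "submitLogin()"


print(LoginScreen().render())
```

Regenerating the base class leaves `LoginScreen` untouched as long as the hook method names stay stable.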

~~~
sebastian
It looks like a very good first step. I wonder how structured/semantic the
generated HTML would be for engineers to "hook" the generated code up to a
backend.

------
visarga
Video here:
[https://www.youtube.com/watch?v=pqKeXkhFA3I](https://www.youtube.com/watch?v=pqKeXkhFA3I)

~~~
catmanjan
Very cool, I wonder if the web one ends up responsive given a bootstrap-y
model.

------
mwcampbell
A tool like this might produce a UI that looks right on the surface. But what
about things like accessibility (e.g. for blind users with screen readers)
that even many human developers don't get right?

------
tanilama
This is just image to layout. The code here is really some nested tree
structure. Even though it is still a very interesting result, calling it
pix2code might be a little misleading.
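
To make the "nested tree structure" point concrete, here is a minimal sketch of turning such a tree into markup. The node kinds and the `TAGS` mapping are made up for illustration and are not pix2code's actual output format.

```python
# Hypothetical mapping from layout-node kinds to HTML tags.
TAGS = {"row": "div", "button": "button", "label": "span"}


def render(node, depth=0):
    """Recursively turn a (kind, children_or_text) tree into indented HTML."""
    kind, content = node
    tag = TAGS[kind]
    pad = "  " * depth
    if isinstance(content, str):  # leaf node: content is the element text
        return f"{pad}<{tag}>{content}</{tag}>"
    inner = "\n".join(render(child, depth + 1) for child in content)
    return f"{pad}<{tag}>\n{inner}\n{pad}</{tag}>"


tree = ("row", [("label", "Name"), ("button", "OK")])
print(render(tree))
```

Once the tree is known, emitting the markup is a mechanical traversal, which would put the hard part in inferring the tree from pixels.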

------
octalmage
This reminds me of a tool AutoHotkey had that let you rip existing UIs. It
would read the window and rebuild the UI with the exact layout.

SmartGUI I think.

------
Stanleyc23
this would address a huge need in mobile UI if we could upload different
screenshots for different screen sizes e.g. iPad and iPhone or different
iPhone orientations and it would autogenerate an efficient set of dynamic
autolayout constraints, content hugging etc to achieve both UIs.

------
braindead_in
What about PSDs? PSD to html is a big market and a product like this could be
a good fit.

~~~
patates
PSDs are layered, and a correctly organized PSD file with some layer naming
conventions should contain enough information to generate layout code without
needing to use machine learning.
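
A rough sketch of that idea, assuming the layer names and bounds have already been exported from the PSD into plain records. The `prefix_id` naming convention and the `PREFIX_TO_TAG` table are hypothetical, and no real PSD-parsing library is used here.

```python
# Hypothetical naming convention: "<prefix>_<id>", where the prefix picks the tag.
PREFIX_TO_TAG = {"btn": "button", "txt": "p"}


def layer_to_html(layer):
    """Turn one layer record into an absolutely positioned HTML element."""
    prefix, _, ident = layer["name"].partition("_")
    tag = PREFIX_TO_TAG.get(prefix, "div")  # unknown prefixes fall back to <div>
    left, top, width, height = layer["bounds"]
    style = (f"position:absolute;left:{left}px;top:{top}px;"
             f"width:{width}px;height:{height}px")
    return f'<{tag} id="{ident}" style="{style}"></{tag}>'


# Layer records as they might be exported from a well-organized PSD.
layers = [
    {"name": "btn_login", "bounds": (20, 300, 120, 40)},
    {"name": "txt_title", "bounds": (20, 40, 280, 32)},
]
for layer in layers:
    print(layer_to_html(layer))
```

This only yields fixed, absolutely positioned markup; responsive behavior would still need extra conventions or hand work.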

------
aub3bhat
By generating the "code", are you merely inferring / representing the UI layout
as some kind of a tree? E.g. each button/panel/textbox gets detected, and an
"RNN/LSTM" generates a tree structure using attention.

~~~
avg_dev
"merely" is not the word I would use.

~~~
tanilama
If you are somewhat familiar with the DL literature, you will see that this
paper, while having a very interesting angle, uses a standard enc-dec
architecture, with the encoder being a CNN and the decoder being an LSTM. Such
applications have been studied before:

[https://arxiv.org/pdf/1609.04938.pdf](https://arxiv.org/pdf/1609.04938.pdf)

The above paper shows nice results turning images into LaTeX expressions and
images into HTML.

------
strin
How does this compare to WYSIWYG editors?

------
hasenj
I seriously hope that AI can soon take over the majority of work (usually
mundane) involved in creating CRUD applications. Leaving the programmer with
simply customizing certain parts or writing very specific business or
validation logic.

It might drive down the wages of _some_ programmers but it will at the same
time free us to work on more interesting problems.

~~~
ensiferum
I'm hoping that it would just free us from work entirely ;-)

~~~
hasenj
Maybe OT but "free from work" is kind of an oxymoron.

You can't be free if you don't produce. You will be a slave to people who do
produce.

~~~
doesnt_know
If you lack the imagination to consider such a future, learn from the best and
read the Culture series of novels.

~~~
hasenj
Imagination is easy. Doesn't mean it's viable (or desirable).

You cannot imagine your way out of human nature.

You'd need at minimum the entire population to be of sufficiently high IQ and
high agreeableness (and low aggressiveness). This kind of population does not
exist, and probably will not exist in the foreseeable future. If it did exist,
it will probably be run over by another population of high aggressiveness. So
unless the population of the entire planet is as such, this will continue to
be mere imagination.

~~~
walterstucco
Mate, my grapes need attentions that I cannot give cause I'm working.

My home made spumante takes time that I have to take from other tasks, such as
laying at the beach

My craft beers won't make themselves

I have tomatoes, potatoes, beans and fruit to grow

And of course programming as a hobby

I would have so much to do, if I didn't have to work…

~~~
hasenj
AI will take care of your grapes and your craft beers will indeed make
themselves, as will your tomatoes, potatoes, and beans.

Laying at the beach will get boring after two weeks probably.

~~~
nl
I think there was a comment on HN yesterday relevant now: _oh the wondrous
things we could do with Haskell if we neither had nor needed jobs_.

------
colejohnson66
The video says the datasets will be available on the Github repository, but I
don't see anything...

~~~
jageen
The GitHub repo says that it will be available on this repository later this
year.

`To foster future research, our datasets consisting of both GUI screenshots
and associated source code for three different platforms (ios, android, web-
based) will be made freely available on this repository later this year. Stay
tuned!`

------
huula
Mapping screenshots to code is not hard. Having the model simply memorize the
screenshot-to-code mappings of the training data can give you almost 100%
accuracy (for some demo). What is hard is how the model generalizes when given
a new screenshot. Getting something to work for mobile is a much easier task
than getting it to work for other, more complex UIs, though. Looking forward
to seeing more updates on this!
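
The gap between memorizing and generalizing can be shown with a toy lookup-table "model". Everything below is a made-up illustration: strings stand in for screenshots, and `MemorizingModel` is a hypothetical class, not the pix2code network.

```python
class MemorizingModel:
    """A 'model' that only remembers exact training pairs."""

    def __init__(self):
        self.table = {}

    def fit(self, pairs):
        for screenshot, code in pairs:
            self.table[screenshot] = code

    def predict(self, screenshot):
        # Unseen inputs get an empty answer: nothing was learned about them.
        return self.table.get(screenshot, "")


def accuracy(model, pairs):
    """Fraction of pairs whose predicted code matches exactly."""
    hits = sum(model.predict(s) == code for s, code in pairs)
    return hits / len(pairs)


# Strings stand in for screenshots; real inputs would be images.
train = [("screen_a", "<button/>"), ("screen_b", "<input/>")]
held_out = [("screen_c", "<button/>")]

model = MemorizingModel()
model.fit(train)
print(accuracy(model, train), accuracy(model, held_out))
```

Scoring on the training pairs says nothing here; only the held-out pair reveals that the table cannot handle a new screenshot.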

~~~
wiz21c
Yep, for example, more dynamic UI such as tables, list of components (for
example kanban swim lanes), etc...

