
Recognizing Graphs from Images - jspdown
https://www.yworks.com/blog/projects-optical-graph-recognition
======
Uhuhreally
I would pay for an iOS app that I could point at a graph I've drawn and get
back an SVG together with a data structure representing the graph

------
fedups
It sounds cliche, but I've found the yEd graph editor crucial in reading a
Dostoevsky novel and keeping up with all the characters and their relations.
Is yWorks pretty much the only game in town when it comes to graph editing?

~~~
ygra
(Disclaimer: yWorks employee)

If you still have that graph somewhere, we'd love to see it. We're always
curious (and sometimes surprised and astonished) what people create with yEd.

As far as competitors go, there are lots of other options, both end-user
applications for graph editing and libraries.

For end-users it seems many stick with the first tool they really like and get
used to its features, strengths, and idiosyncrasies (and from my experience
there are many weirdnesses among those applications, including our own).
Automatic layout may be a killer feature for our offering, though. As far as I
know there is not much that can compare here (although for many people simple
hierarchic or force-directed approaches may suffice and they might not need
every option).

For library users it often comes down to a decision based on required
features, cost, custom development effort, and target platform. I think we're
well-situated for customers for whom cost is less of an issue, who have
competent developers, and who require extensive customization (and support). It's
not uncommon that D3 might be a better choice, depending on the requirements.

~~~
fedups
Thanks for the response! FWIW, I find nothing lacking in the application, but
I'm working on an iPad and it seems a native app would be less constrained by
the browser quality.

------
awinter-py
Chart recognition and OCR in general would make UGC sites, in particular
Wikipedia, much more powerful. Wikipedia is full of uploaded charts that
should be datasets.

In general, chart sharing on the web is bad. If we can't have a <chart>
element, maybe chart parsing is the next best thing to preserve some of the
original information

~~~
ygra
I think Wikipedia has macros that allow you to create a variety of charts with
just textual information. I've seen at least timelines and family trees being
created that way. The benefit is that it's text-editable, and the information
is still accessible. In some cases, the markup is very, very unreadable,
though, and might actually be tool-generated. It still maintains the benefits
in theory.

For things that are trivial in SVG or even HTML, like bar or line charts,
there should be little to no excuse not to use markup, agreed.
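
To illustrate how little markup a bar chart needs, here is a minimal sketch
that turns (label, value) pairs into SVG. The function name `bar_chart_svg`
and its parameters are made up for this example, and it assumes at least one
value is positive. Each bar carries its value in a <title> element, so the
underlying data stays readable from the markup:

```python
from xml.sax.saxutils import escape


def bar_chart_svg(data, bar_width=40, gap=10, height=100):
    """Render (label, value) pairs as a minimal SVG bar chart.

    Hypothetical helper for illustration; assumes max(value) > 0.
    """
    max_value = max(value for _, value in data)
    width = len(data) * (bar_width + gap) + gap
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height + 20}">'
    ]
    for i, (label, value) in enumerate(data):
        # Scale the bar so the largest value fills the full chart height.
        bar_height = round(height * value / max_value)
        x = gap + i * (bar_width + gap)
        y = height - bar_height  # SVG's y axis points downwards
        parts.append(
            f'<rect x="{x}" y="{y}" width="{bar_width}" height="{bar_height}">'
            f'<title>{escape(label)}: {value}</title></rect>'
        )
        parts.append(
            f'<text x="{x + bar_width / 2}" y="{height + 15}" '
            f'text-anchor="middle">{escape(label)}</text>'
        )
    parts.append('</svg>')
    return "\n".join(parts)


if __name__ == "__main__":
    print(bar_chart_svg([("2016", 12), ("2017", 30), ("2018", 21)]))
```

The numbers in the usage line are placeholders, not real data.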

In this case, we've concentrated on parsing graphs, though, as that's our main
line of work (you could draw charts with yFiles, but it's not really that
useful for it). We're still trying to find the time to clean up our code and
publish it somewhere. This has been just a week-long effort by four people,
but was definitely a fun learning experience.

------
ygra
yWorks employee, developer and one of the authors of that blog post (and the
described code) here. Happy to answer any questions regarding graph drawing or
recognition.

~~~
burning_hamster
Would you guys mind open sourcing your data set, including the cases where you
currently fail (presumably graphs with edge crossings -- I couldn't help but
notice that all examples are planar graphs)? I think I may have some ideas how
to tackle some of the unresolved issues you state at the end of the blog post.

~~~
ygra
Yeah, we intentionally omitted edge crossings. There's prior research in
dealing with them and those approaches work well (and are clever), so we
thought it wouldn't be terribly useful for us to re-invent that part. The
papers are linked in the blog post near the end where we compare our approach
with previous attempts.

We've then concentrated on different segmentation strategies (the other
approaches started with a sensible binarization of the image, which precludes
color-based segmentation), as well as getting visual characteristics right,
such as shape and color. The algorithms still don't handle cases well where
features are much larger than we expect (e.g. photos are very different from
screenshots). That would certainly be an area for improvement.
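
As a toy illustration of why binarizing first precludes color-based
segmentation (this is not the code from the blog post, just a sketch with
made-up helper names and hand-picked pixel values): two differently colored
nodes with similar brightness collapse into one foreground region after
thresholding, while labelling by color keeps them apart.

```python
def luminance(rgb):
    """Approximate brightness using the common Rec. 601 luma weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def binarize(pixels, threshold=128):
    """Map each pixel to 1 (dark, foreground) or 0 (light, background)."""
    return [[1 if luminance(p) < threshold else 0 for p in row]
            for row in pixels]


def color_labels(pixels):
    """Give every distinct color its own label, in order of first appearance."""
    labels = {}
    result = []
    for row in pixels:
        out_row = []
        for p in row:
            if p not in labels:
                labels[p] = len(labels)
            out_row.append(labels[p])
        result.append(out_row)
    return result


# Hypothetical one-row "image": a red node directly next to a green node.
WHITE = (255, 255, 255)
RED = (200, 0, 0)
GREEN = (0, 102, 0)  # chosen to have almost the same luminance as RED

image = [[WHITE, RED, RED, GREEN, GREEN, WHITE]]

if __name__ == "__main__":
    print(binarize(image))      # both nodes end up as the same foreground value
    print(color_labels(image))  # the two nodes get different labels
```

A real pipeline would have to cluster similar colors rather than match them
exactly, since photos and anti-aliased screenshots rarely contain flat colors.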

Our data set is ... basically most of what the screenshots in the article show
(one of them opens an album with more images). We didn't have time for a
thorough testing of thousands of different graphs. That's something we'd
certainly have to do if we wanted to publish anything ;-)

------
anonytrary
The title led me to believe this would be about extracting relational
hierarchies (e.g. scene trees, dependency graphs) from arbitrary images. That
would have been very impressive, and somewhat unbelievable. The actual article
appears to be about extracting graphs from images of literal graphs.

