

Extract numerical data from plot images - deniszgonjanin
https://github.com/ankitrohatgi/WebPlotDigitizer

======
jamessb
This is very nice.

I particularly like the automatic mode. However, the default 'distance'
threshold of 120 seems to be too large: selecting the yellow foreground color
picks points on both the yellow and red lines; selecting the green foreground
color picks points on both the green curve and the axes.

A solution for the test image is to just reduce the threshold to 50.

The best threshold may depend on both the target color and the specific image
being processed, but could potentially be automatically chosen: imagine
constructing a grayscale image by replacing each pixel's color with its
distance from the target color, and then picking a threshold for converting
this grayscale image to a binary image (e.g. using Otsu's method [0]).
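
A minimal sketch of that idea in plain Python, for illustration only. It is not WebPlotDigitizer's code, and the function names (`color_distance_image`, `otsu_threshold`) are made up here. It assumes the image is a list of rows of (r, g, b) tuples and that "distance" means Euclidean distance in RGB space, which may not match how the app measures it.

```python
import math

def color_distance_image(pixels, target):
    """Replace each (r, g, b) pixel with its Euclidean RGB distance to target.

    Hypothetical helper: the image layout (list of rows of tuples) is an
    assumption made for this sketch.
    """
    return [[math.dist(p, target) for p in row] for row in pixels]

def otsu_threshold(values, bins=256):
    """Pick the threshold that maximizes between-class variance (Otsu)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # flat image: nothing to separate
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1

    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_var, best_bin = -1.0, 0
    w0 = 0      # number of values in the "near" class so far
    sum0 = 0    # weighted bin sum of the "near" class so far
    for i in range(bins):
        w0 += hist[i]
        sum0 += i * hist[i]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_bin = var, i
    # Upper edge of the best bin: distances below this count as foreground.
    return lo + (best_bin + 1) * width

# Tiny made-up image: two yellowish pixels, one white, one red.
image = [[(255, 255, 0), (250, 250, 10)],
         [(255, 255, 255), (255, 0, 0)]]
dist = color_distance_image(image, target=(255, 255, 0))
threshold = otsu_threshold([d for row in dist for d in row])
mask = [[d < threshold for d in row] for row in dist]
```

Here `mask` marks the pixels that would be treated as belonging to the selected curve.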

The export to plot.ly feature is also great - you can really quickly extract
data from one graph and re-plot it in a different/clearer way.

[0]: https://en.wikipedia.org/wiki/Otsu%27s_method

~~~
ankitrohatgi
Hey, I am the developer of this app. Thanks for the feedback on the color
distance. Also, I wasn't aware of the Otsu method, but have thought of doing
something very similar - I will give it a shot for sure. I have added an
enhancement item in the issue tracker.

~~~
jamessb
On further thought this probably won't work very well: the histogram of colour
distances will be multimodal, with one peak for each color in the image (and
with the peak corresponding to the white background being far taller than the
others). The Otsu method assumes there is just a foreground and background
that need to be separated.

A clustering method like k-means might work, but I've had a quick play in
Matlab and the results weren't great.
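
For reference, the k-means idea can be sketched roughly as follows. This is not the Matlab experiment mentioned above, just a plain-Python illustration with made-up names (`kmeans_colors`, `nearest_cluster`). It assumes pixels are (r, g, b) tuples, clusters them in RGB space, and seeds the centroids with the first k distinct colors so the result is reproducible. Real implementations usually use a better initialization, and the number of clusters k has to be guessed.

```python
import math

def kmeans_colors(pixels, k, iterations=20):
    """Cluster (r, g, b) tuples into k groups; returns (centroids, labels).

    Hypothetical helper for illustration. Seeding with the first k distinct
    colors is an assumption chosen for determinism, not a recommended default.
    """
    distinct = list(dict.fromkeys(pixels))
    if len(distinct) < k:
        raise ValueError("fewer distinct colors than clusters")
    centroids = [tuple(map(float, c)) for c in distinct[:k]]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in pixels]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(ch) / len(members) for ch in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep empty clusters in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels

def nearest_cluster(centroids, target):
    """Index of the centroid closest to the user's chosen foreground color."""
    return min(range(len(centroids)),
               key=lambda j: math.dist(centroids[j], target))

# Made-up pixel list: white background, a yellow curve, a red curve.
pixels = [(255, 255, 255)] * 3 + [(255, 255, 0), (250, 250, 10),
                                  (255, 0, 0), (250, 5, 5)]
centroids, labels = kmeans_colors(pixels, k=3)
chosen = nearest_cluster(centroids, target=(255, 255, 0))
mask = [lab == chosen for lab in labels]
```

On real plots, anti-aliased edges and a dominant white background tend to pull the centroids around, which may be part of why results are underwhelming in practice.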

~~~
ankitrohatgi
Thanks for trying it out. I am also working on a grid removal algorithm which
will run into similar issues and so it's good to get some ideas. The naive
color distance approach might not be the best for that scenario.

------
acemarke
That is fantastic. I actually wrote a one-off utility to do exactly this from
polar plots several years back. This is WAAAAAYYYY better, in both features
and UI. I don't have a specific need for this now, but it's actually kinda
nifty to see someone else do this.

------
ankitrohatgi
Hello everyone, I am the developer of this app. I wanted to thank you for the
appreciation and I hope you find this tool useful in your work.

There are a lot of things that I want to add to this app in the future - e.g.
working with heat maps, path and area measurement, simple image editing tools
etc. Also while the auto extraction stuff works ok at the moment, I think
there is still a lot of room for improvement. I hope I can get to all these
things soon.

If you have any suggestions, then feel free to comment here, or use the issue
tracker on GitHub. You can also just send me an email.

------
filmor
Awesome tool :)

One minor thing, it could make sense to decrease the default colour distance
in the example, as with the current default setting the orange and red lines
are selected together in automatic mode when using the orange from the
"dominant colors". A distance of 78 or less separates them.

------
dmd
Neat. I've been using GraphClick [0] for this for decades.

[0]: http://www.arizona-software.ch/graphclick/

------
berryg
This brings back memories. I used to use these kinds of programs while
studying electrochemistry more than 20 years ago. Apparently they are still
relevant.

------
balazsdavid987
Amazing work! Can it process plots with one data set (one line) with an all
white background?

