
OpenCV in the browser using WebAssembly and web workers - aralroca
https://aralroca.com/blog/opencv-in-the-web
======
jonnydubowsky
Here's a link to a demo you can try in your browser. (not my demo).
[https://huningxin.github.io/opencv.js/samples/video-processi...](https://huningxin.github.io/opencv.js/samples/video-processing/index-wasm.html)

~~~
sax
This app lets you play around with OpenCV, like jsfiddle or codepen:
[https://cloudvision.app/](https://cloudvision.app/)

~~~
ehsankia
That's a... very misleading name. It's called "Cloud" vision, but then says:

> Cloud Vision does not track, record or send any images, videos, or
> information provided by the user to any server. All image processing is
> performed within the application.

------
eastendguy
Interesting, this is powerful. UIVision is using OpenCV/WebAssembly for their
image recognition features: "...run automated visual UI tests inside the web
browser and on the desktop, powered by WebAssembly."
([https://ui.vision/rpa/docs/visual-ui-testing](https://ui.vision/rpa/docs/visual-ui-testing))

You can install the free Chrome/Firefox extension to test it. In general, I
continue to be amazed by how powerful the WebAssembly concept is.

~~~
pjmlp
Be amazed,
[https://www.youtube.com/watch?v=UQiUP2Hd60Y](https://www.youtube.com/watch?v=UQiUP2Hd60Y)

[https://opencv.org/opencv-ported-to-google-chrome-nacl-and-p...](https://opencv.org/opencv-ported-to-google-chrome-nacl-and-pnacl/)

------
ArtWomb
Good stuff! It's nice to see your port of OpenCV to the browser succeed, and a
lot of people would be very interested in adopting this! But you may not see
more than 2x speedup over raw pixel manipulation of the 2D canvas.

WebGPU holds a lot of promise for fast image processing on the client. 10x
boost is not uncommon for RTX 2000 devices ;)
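
For reference, "raw pixel manipulation of the 2D canvas" typically means looping over the RGBA bytes that `getImageData` returns. Below is a minimal sketch of that baseline, assuming a grayscale conversion with the common BT.601 luma weights; the function name `toGrayscale` is made up for illustration, and this is not the article's code.

```typescript
// Convert RGBA pixel data (4 bytes per pixel) to grayscale in place.
// The weights are the common BT.601 luma coefficients; other weightings exist.
function toGrayscale(rgba: Uint8ClampedArray): Uint8ClampedArray {
  for (let i = 0; i + 3 < rgba.length; i += 4) {
    const luma = Math.round(
      0.299 * rgba[i] + 0.587 * rgba[i + 1] + 0.114 * rgba[i + 2]
    );
    rgba[i] = luma;
    rgba[i + 1] = luma;
    rgba[i + 2] = luma;
    // rgba[i + 3] is the alpha channel and is left untouched.
  }
  return rgba;
}

// In a browser this would be applied to a canvas frame, e.g.
// (`ctx` and `canvas` are assumed to exist):
//   const frame = ctx.getImageData(0, 0, canvas.width, canvas.height);
//   toGrayscale(frame.data);
//   ctx.putImageData(frame, 0, 0);
```

Timing a loop like this against the equivalent OpenCV.js call on the same frame is one way to check the speedup claim on a given device.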

~~~
mauflows
Is there a WebGPU-based alternative that has the functionality of OpenCV?

~~~
sp332
Not sure of the status, but there is a project to get WebGPU working as a
backend for OpenCV.js
[https://summerofcode.withgoogle.com/projects/#53528827207352...](https://summerofcode.withgoogle.com/projects/#5352882720735232)

------
Endlessly
How would one benchmark or compare running OpenCV as native code versus as
non-native embedded code?

I ask because the intent of the code is not so much embedding OpenCV in a
browser as offloading the computation workload of many users from a server to
each end user’s computer.

Might be wrong, but for a single user, this setup would likely not be optimal.
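
One rough way to approach the benchmarking question is to time the same operation many times in each environment and compare medians. The sketch below covers only the JavaScript/WebAssembly side; `timeRuns` and `median` are hypothetical helper names, and the native side would need an equivalent harness in its own language.

```typescript
// Run `task` `runs` times and return the duration of each run in milliseconds.
// `now` is injectable so a higher-resolution clock can be swapped in.
function timeRuns(
  task: () => void,
  runs: number,
  now: () => number = () => Date.now()
): number[] {
  const durations: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = now();
    task();
    durations.push(now() - start);
  }
  return durations;
}

// The median is less sensitive to warm-up and GC outliers than the mean.
function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of empty list");
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}
```

A fair comparison would also use the same input image and the same OpenCV build options on both sides, and discard the first few runs.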

~~~
fest
This is probably the first time I am excited to hear about "X in webassembly".

While it'll likely be a lot slower than native implementation, the benefits of
an image not leaving your computer could unearth some interesting applications
(for example, I am very hesitant to use online OCR services and use them only
for data which is public anyway).

------
punarinta
Some time ago I built an OpenCV-based real-time masks plugin for my
videoconference tool, but unfortunately had to limit it to a single-threaded
WASM version because of browser support. That resulted in 320x240 videos when
the mask was on. However, as an experiment I also ran an 8-threaded version
locally, and its performance on a laptop from 2015 was more than enough for an
almost 30fps stream with a standard video size.

If anyone is interested you can check it out on
[https://xroom.app](https://xroom.app) or even contribute ideas and
commits: [https://github.com/punarinta/xroom-plugins/tree/master/nisdo...](https://github.com/punarinta/xroom-plugins/tree/master/nisdos/masks).
For the masks I'd still recommend a desktop browser though.

If you're curious but don't want to bother clicking too much, here's what it
all looks like: [https://imgur.com/93YR87e](https://imgur.com/93YR87e)

------
jjice
Does anyone have a good recommendation for resources to learn a bit about
computer vision? I don't want to go too in depth, but I'd like to learn the
basics.

~~~
canada_dry
If you know python, this is an excellent resource:

[https://www.pyimagesearch.com/](https://www.pyimagesearch.com/)

~~~
dr_zoidberg
In general, I like Adrian's site but over the years he has been locking more
and more content (even blog posts that used to be available on the site)
behind his courses. Not that there's anything wrong with charging for
knowledge (you know, teaching and getting paid for it), but it's a shame that
things that used to be available, and of great quality, have been "pay-walled"
somehow.

Another good resource is [https://learnopencv.com/](https://learnopencv.com/).
Again, good quality stuff; if you know your way around it tends to be enough[0],
but it's also a big funnel to get you to buy one of their courses.

[0] though Satya Mallick does hide a lot of complexity from the readers, and
that bites you if you try to implement things on your own

------
lukevp
This is really cool! WebAssembly is such a great concept and I can’t wait for
a better way to manage and preload all these wasm libs.

On my iPhone 11, it requested access to the camera and showed the image, says
it’s running at 60fps and is using the camera, but it only captured a single
frame.

~~~
auggierose
On an iPhone 11 it is such a waste though :-) Today I ported some image
processing code from Swift to Metal. Speedup factor: 1000.

------
extesy
Does anyone know how to implement a virtual webcam in Python on macOS? I want
to implement something like Zoom's background replacement, but I can't find a
way to expose the output as a webcam that can be used as an input by
various conferencing apps.

~~~
ehsankia
This is definitely not optimal and would be an overkill setup, but OBS Studio +
the VirtualCam plugin lets you basically screen-capture anything and turn it
into a webcam device. So if your Python app can display a video feed, you can
capture the window and show it as a webcam (with OBS as overhead).

~~~
hughes
VirtualCam is not available on MacOS.

------
amelius
Why does this use a polling routine to check if the worker is
available/finished? Can't this be done more "elegantly"?
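
One possible alternative, as a sketch: tag each request with an id and resolve a Promise when the worker posts back a reply carrying the same id, so nothing has to poll. The message shape (`{ id, action, payload }` out, `{ id, result, error }` back), the `WorkerLike` interface, and the `WorkerClient` class are all hypothetical names chosen for illustration, not the article's code. The worker script would have to echo the `id` in its reply.

```typescript
// Reply shape this sketch assumes the worker sends back.
interface Reply {
  id: number;
  result?: unknown;
  error?: string;
}

// Minimal subset of the Worker interface that the sketch relies on.
interface WorkerLike {
  postMessage(message: unknown): void;
  onmessage: ((event: { data: Reply }) => void) | null;
}

interface Pending {
  resolve: (value: unknown) => void;
  reject: (reason: Error) => void;
}

class WorkerClient {
  private worker: WorkerLike;
  private nextId = 0;
  private pending = new Map<number, Pending>();

  constructor(worker: WorkerLike) {
    this.worker = worker;
    // A single handler routes every reply to the Promise waiting on its id.
    worker.onmessage = (event) => {
      const { id, result, error } = event.data;
      const entry = this.pending.get(id);
      if (!entry) return; // Unknown or already-settled id: ignore.
      this.pending.delete(id);
      if (error) entry.reject(new Error(error));
      else entry.resolve(result);
    };
  }

  // Send a request and get a Promise that settles when the reply arrives.
  call(action: string, payload: unknown): Promise<unknown> {
    const id = this.nextId++;
    return new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
      this.worker.postMessage({ id, action, payload });
    });
  }

  // Number of requests still waiting for a reply.
  pendingCount(): number {
    return this.pending.size;
  }
}
```

With this in place a caller could write something like `await client.call("imageProcessing", imageData)` instead of checking a status flag on an interval. A production version would likely also add timeouts and handle worker errors.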

------
zwieback
Sounds interesting, but it's a little unclear what's running at what layer. It
sounds like the JS code from OpenCV.js is now running in WebAssembly? And how
much of OpenCV is still running in native code, e.g. in prebuilt OpenCV
libraries?

~~~
icebraining
From what I can tell, the OpenCV C++ code is directly compiled to wasm
bytecode; the only JS part is some helper code to let you easily call wasm
functions from your own code.

------
shihn
I used OpenCV in WASM to create a RoughJS version of an image a couple of
years ago. [https://pshihn.github.io/rough-draw/](https://pshihn.github.io/rough-draw/)

------
mbzi
Love this, OpenCV and all WebAssembly projects. I also use next-translate now
and then so kudos for that!

I have been putting AR in the browser since Java applets with JOGL were a
thing! I've been nominated twice this year for the Webby awards on AI and AR in
browser (1). We are a small innovative team who have been utilising Emscripten
and like-minded technologies for a few years, since Emscripten and WebRTC were
starting to be a thing.

I wanted to share some pain points taking this tech to production.

- Bandwidth

This is huge with OpenCV: ~4.5 MB+ just to take a picture is quite a difficult
bandwidth cost to accept, especially as the clients I worked with have millions
of views per day. For comparison, the total binary for Max Factor VMUA (2) is
the same size, and that includes a large data set needed for a neural network
for skin tone analysis and face feature detection.

Learning: Do not include all of OpenCV. You probably don't need it all, so
cherry-pick the parts you do need. I also recommend writing the simple parts
yourself (this is for those of you who just use cv::Mat!).

- Speed

If you want a 60 FPS AR effect / AI algo on an Android device, OpenCV isn't
always the fastest approach. Do not rely on a framework; you will need to get
your hands dirty and optimise/rewrite the slow areas. WebAssembly is fast, but
not as fast as the desktops and native environments you normally create this
code on.

- Market

Not everyone has an iPhone in London. Bandwidth means seconds, and JS and
WebAssembly execution adds to this. In a world where m-commerce is king this
does matter. Think Poland, the middle of nowhere in Ohio, Brazil, etc. If it
takes 60 seconds for a web app to load on 3G and then another 20 for the
executable to start, and the experience is then sluggish, it won't be
commercially successful.

- UX

When you put this into a large site, most traffic will come via Instagram and
Facebook. On iOS this is typically within a WKWebView, which does not support
getUserMedia. Make sure you have some nice hints on how to open within iOS
Safari (or Android Chrome if the parent app has not enabled permissions).

Nevertheless, I wish this blog post had existed when I started out. I regret
not writing something similar. In this post I especially love the simplicity of
the Emscripten pipeline, which is great. It is a fantastic demo post. I do hope
it inspires many to play with this innovative stack.

1)
[https://twitter.com/Holition/status/1258068773623431177](https://twitter.com/Holition/status/1258068773623431177)
2) [https://www.maxfactor.com/vmua/](https://www.maxfactor.com/vmua/)
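
To put the bandwidth and market points above into numbers, here is a back-of-the-envelope sketch. It assumes the ~4.5 figure is in megabytes, that 1 MB is 10^6 bytes, and that a slow mobile connection delivers about 1 Mbit/s. That link speed is an illustrative assumption, not a measurement, and the estimate ignores latency, compression, and caching. `downloadSeconds` is a hypothetical helper name.

```typescript
// Estimate transfer time in seconds for a payload of `sizeMB` megabytes
// over a link of `linkMbitPerSec` megabits per second.
// Assumes 1 MB = 8 Mbit and ignores latency, compression, and caching.
function downloadSeconds(sizeMB: number, linkMbitPerSec: number): number {
  if (linkMbitPerSec <= 0) throw new Error("link speed must be positive");
  return (sizeMB * 8) / linkMbitPerSec;
}

// The ~4.5 MB build mentioned above, at an assumed 1 Mbit/s:
console.log(downloadSeconds(4.5, 1)); // 36
// The 7.75 MB opencv.js size quoted elsewhere in this thread:
console.log(downloadSeconds(7.75, 1)); // 62
```

Under those assumptions the full build lands close to the 60-second figure mentioned in the Market point, which is why trimming the binary matters.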

------
suyash
Nice tutorial, can you do one with Vanilla JS as I don't use React.

------
Chris2048
I wonder what the filesize of opencv.js is..

~~~
ThePadawan
It is linked at the end of the article:

[https://github.com/vinissimus/opencv-js-webworker/blob/maste...](https://github.com/vinissimus/opencv-js-webworker/blob/master/public/js/opencv.js)

7.75MB.

~~~
ape4
Actually not bad

