Really great, congratulations, I hope that I can find a way to apply this lesson to my SaaS.
I assume YoHa means Your Hands... I don't think I could have resisted OhHi for hand tracking.
How does the application deal with different skin tones?
Two hand support would be nice and I would love to add it in the future.
The engine should work well with different skin tones, as the training data was collected from a large and diverse set of individuals. The training data will also keep growing over time, making the engine more and more robust.
The tech is all there, really it's just having the time and effort to get all the pieces together!
As a side note: The wasm files are actually from the inference engine (tfjs).
Please let me know if you have any more questions in that regard.
YoHa uses TensorFlow.js (tfjs), which provides several backends for computation. One indeed uses WASM, the other is WebGL-based. The latter is usually the more powerful one.
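To make the backend point concrete, here is a rough sketch of how a page might choose between them. `pickBackend` and the preference order are my own assumptions for illustration, not something YoHa ships; only the backend names (`webgl`, `wasm`, `cpu`) and `tf.setBackend` come from tfjs itself.

```typescript
// Preference order is an assumption based on the comment above:
// WebGL first (usually fastest), then WASM, then plain CPU as a last resort.
const PREFERRED_BACKENDS: string[] = ["webgl", "wasm", "cpu"];

// Hypothetical helper (not part of YoHa or tfjs): returns the first
// preferred backend that the current environment reports as available.
function pickBackend(
  available: string[],
  preferred: string[] = PREFERRED_BACKENDS
): string | null {
  for (const candidate of preferred) {
    if (available.indexOf(candidate) !== -1) {
      return candidate;
    }
  }
  return null; // nothing usable was found
}

// In a real page the chosen name would then be handed to tfjs, e.g.
//   await tf.setBackend(chosen);
//   await tf.ready();
console.log(pickBackend(["wasm", "cpu"])); // prints "wasm"
```

Keeping the selection separate from the tfjs call makes it easy to log which backend was picked, which might help with reports where the page stalls during warm-up.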
I have not explored this space much so far, as my focus is on building the infrastructure that enables such applications rather than building the applications myself.
Latency is very low, which is very important for this use case. Look on YouTube for demos.
Hell, the library could even stitch the takes together, omitting the times when my hand started/finished doing the gestures.
Otherwise looks pretty impressive! I've been looking for something like this and I may give it a whirl.
Web page doesn't say anything after `Warming up...` and the latest message in the browser console is:
Setting up wasm backend.
Just note that in the demo video, the user is 'writing' everything mirrored.
Edit: OTOH fingerspelling (https://en.m.wikipedia.org/wiki/Fingerspelling) might be a more feasible use case!
I want something like this so I can bind hand gestures to commands.
For example scroll down on a page by a hand gesture.
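A minimal sketch of what such a binding layer could look like, assuming the tracker hands you a gesture label per frame. The gesture names (`pinch`, `fist`) and `createGestureDispatcher` are hypothetical and only serve the example; in a browser the scroll command would call something like `window.scrollBy(0, 100)` instead of writing to a list.

```typescript
type Command = () => void;

// Hypothetical helper: wraps a gesture-to-command table and returns a
// function that runs the command bound to a recognized gesture.
function createGestureDispatcher(bindings: { [gesture: string]: Command }) {
  return function dispatch(gesture: string): boolean {
    // Ignore gestures that have no binding (and inherited object keys).
    if (!Object.prototype.hasOwnProperty.call(bindings, gesture)) {
      return false;
    }
    bindings[gesture]();
    return true;
  };
}

// Stand-in for real side effects so the sketch runs outside a browser.
const performed: string[] = [];

const dispatch = createGestureDispatcher({
  pinch: () => { performed.push("scroll-down"); }, // e.g. window.scrollBy(0, 100)
  fist: () => { performed.push("scroll-top"); },   // e.g. window.scrollTo(0, 0)
});

dispatch("pinch");
dispatch("wave"); // unbound gesture, silently ignored
dispatch("fist");
console.log(performed.join(",")); // prints "scroll-down,scroll-top"
```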
However, you likely want this functionality on any website that you are visiting, for which you would probably need to build a browser extension. I haven't tried incorporating YoHa into a browser extension, but if somebody were to try I'd be happy to help.
So I guess it would have to be sitting on my machine.
For example hand gestures to switch the desktop workspace.
Swipe left/right motion to switch desktop workspace. That would be the dream :)
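For what it's worth, the swipe part can be sketched without any tracking library, assuming you already get a stream of normalized hand positions with timestamps. Everything here (`HandSample`, `detectSwipe`, the 0.3 distance and 500 ms thresholds) is made up for illustration; actually switching the workspace would still need something running on the machine.

```typescript
// Hypothetical sample shape: x is the horizontal hand position normalized
// to [0, 1], t is a timestamp in milliseconds. Samples are assumed to be
// sorted by time, oldest first.
interface HandSample {
  x: number;
  t: number;
}

type Swipe = "left" | "right" | null;

// Reports a swipe when the hand moved at least `minDistance` horizontally
// within the last `maxDurationMs` milliseconds. Thresholds are guesses.
function detectSwipe(
  samples: HandSample[],
  minDistance: number = 0.3,
  maxDurationMs: number = 500
): Swipe {
  if (samples.length < 2) {
    return null;
  }
  const last = samples[samples.length - 1];

  // Find the oldest sample that still falls inside the time window.
  let start = last;
  for (const sample of samples) {
    if (last.t - sample.t <= maxDurationMs) {
      start = sample;
      break;
    }
  }

  const dx = last.x - start.x;
  if (Math.abs(dx) < minDistance) {
    return null;
  }
  // Note: with a mirrored camera image the direction may need to be flipped.
  return dx > 0 ? "right" : "left";
}

console.log(detectSwipe([{ x: 0.2, t: 0 }, { x: 0.4, t: 100 }, { x: 0.7, t: 200 }])); // prints "right"
```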
Note that if you were trying iOS/Safari and not iOS/Chrome, there is nothing that can be done due to a limitation that is documented in the section "Discussion" here: https://developer.apple.com/documentation/webkitjs/canvasren... Will document this.
So many educational uses, well done.
I can also see this being very helpful for people who have cerebral palsy, for example. Larger movements are easier, this might help someone use the web more easily.
Maybe if this was the input device that interacts with the standard web, then there is potential here, but it would be unfortunate if a company used this as a primary means of input.
Tailoring software that can use very general-purpose input equipment is much cheaper. Training a neural net to recognize one-handed gestures, for instance, could be done by one developer then deployed worldwide. Making a decent one-hand keyboard is much harder to do and much harder to scale.
Imagine if your bank started using these to access your account and suddenly disabled customers could no longer use their adaptive input devices to interact with their account.
You end up with complicated systems trying to cover all of the edge cases.