I’ve been doing something similar to control racing video games while riding an indoor exercise bike. I convert pedal speed into brake and acceleration key presses with pynput (there’s a middle band of pedaling that counts as neutral). I use OpenCV for webcam input and MediaPipe’s pose models to map lean angle to left/right keyboard presses and head tilt to up/down. It’s really fun, but the slight latency in recognizing my pose is tough for fast courses. Now I’m thinking about an Arduino with a sensor, a low-res IR camera, or something else that could detect changes a lot more quickly.
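For anyone curious what the lean-to-steering part can look like: here's a minimal sketch using MediaPipe's legacy Pose solution plus pynput, not my actual code. The `torso_lean_deg` helper, the deadzone threshold, the key mapping direction, and the `model_complexity=0` choice are all assumptions on my part; tune them for your own setup.

```python
import math

import cv2
import mediapipe as mp
from pynput.keyboard import Controller, Key

LEAN_DEADZONE_DEG = 8  # leans smaller than this count as "neutral" (made-up threshold)

mp_pose = mp.solutions.pose
keyboard = Controller()


def torso_lean_deg(landmarks):
    """Signed torso lean from vertical, in degrees.

    Positive = shoulders shifted toward the image's right (the rider's
    left on an un-mirrored webcam); negative = the other way.
    """
    l_sh = landmarks[mp_pose.PoseLandmark.LEFT_SHOULDER]
    r_sh = landmarks[mp_pose.PoseLandmark.RIGHT_SHOULDER]
    l_hip = landmarks[mp_pose.PoseLandmark.LEFT_HIP]
    r_hip = landmarks[mp_pose.PoseLandmark.RIGHT_HIP]
    sh_x, sh_y = (l_sh.x + r_sh.x) / 2, (l_sh.y + r_sh.y) / 2
    hip_x, hip_y = (l_hip.x + r_hip.x) / 2, (l_hip.y + r_hip.y) / 2
    # Angle of the hip->shoulder vector from straight up (image y grows downward).
    return math.degrees(math.atan2(sh_x - hip_x, hip_y - sh_y))


def steer(lean_deg, held):
    """Press/release arrow keys so exactly the wanted one is held down."""
    want = set()
    if lean_deg > LEAN_DEADZONE_DEG:
        want.add(Key.left)   # flip left/right here if your game steers the other way
    elif lean_deg < -LEAN_DEADZONE_DEG:
        want.add(Key.right)
    for key in held - want:
        keyboard.release(key)
    for key in want - held:
        keyboard.press(key)
    return want


cap = cv2.VideoCapture(0)
held = set()
with mp_pose.Pose(model_complexity=0) as pose:  # lowest complexity = least latency
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.pose_landmarks:
            lean = torso_lean_deg(results.pose_landmarks.landmark)
            held = steer(lean, held)
        cv2.imshow("pose", frame)
        if cv2.waitKey(1) & 0xFF == 27:  # Esc to quit
            break
for key in held:
    keyboard.release(key)  # don't leave a key stuck down on exit
cap.release()
```

Head tilt for up/down works the same way, just with the angle between the ear landmarks instead of the hip-to-shoulder vector.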
Considering his last (brilliant) project featured here, the semaphore keyboard madness, I'm not sure how serious this is. But there seems to be potential for providing a useful API for AR/VR apps/games/experiences; as far as I know, the body isn't extensively served by the Quest etc. beyond hand and head tracking.
To me, the main motivation beyond "because I can" is the accessibility of these projects: you just need some Python code and any webcam. I'd love to play around more with VR, but the owners of the leading systems aren't exactly known for their openness to external APIs... Maybe Google Cardboard is worth a revisit :)
That's interesting; I've always wanted to try something similar but put it off. I'll try it with my phone and see if I can get it working in a small space with an ultra-wide camera.
Pretty well, from my limited testing! For standing poses it works roughly as well as a human in good lighting and contrast conditions: in situations where you would have a hard time picking out where limbs are, MediaPipe will too. The more common issue is poor lighting, or clothing that blends into a busy background.