I tried doing this with the internal webcam, using a clip-on fisheye lens, and mirrors. And eventually punted, using the bare webcam only for head tracking, and adding USB cameras on sticks perched on the laptop screen. With more sticks to protect the USB sockets from the cables. And lots of gaff tape.
Leap Motion has finally been acquired, so the future of the product is unclear. And it's Windows-only (the older and even cruftier version that supports Linux doesn't do background rejection, and so can't be used pointing down at a keyboard). But it has APIs, so you can export the data. My fuzzy impression is it's not quite good enough for surface touch events, but it's ok-ish for pose and gestures. When the poses don't have a lot of occlusion. And the device is perched on a stick in front of your face.
> Gestural stuff is nice when it's transparent and guess-able. Sadly not often the case
I fuzzily recall some old system (Lisp Machine?) having a status bar with a little picture of a mouse, showing what each of its buttons would do if pressed. And a key impact of VR/AR is having more and cheaper UI real estate to work with. So always showing what gestures are currently available, and what they do, should become feasible.
Even on a generic laptop screen, DIYed for 3D, it seems you might put such secondary information semitransparently above the screen plane. And sort of focus through it to work. Making the overlay merely annoying, rather than intolerable.
But when it all works, yeah, magical. Briefly magical. The future is already here... it just has a painfully low duty cycle. And ghastly overhead.