Real-time virtual puppeting has been done in movies/television and research for a while now and yes, it can easily fool people.
A professor friend of mine, Jeremy Bailenson at Stanford, actually uses the Kinect to track facial movements and drives 3D models of other people as puppets in real time. Even more interesting, he can morph your face with that of the person you're video conferencing with to create a feeling of commonality in them.
He actually wrote a book on it called Infinite Reality [1], which talks about all kinds of ways people will probably get manipulated in the future. He covers things like mirroring movements (which he can do automatically in a video conference), looking into the eyes of every participant in a group video conference, and other really interesting psychological hacks.
It is amazing how DFW nails that in Infinite Jest. In the book, people have video-phones, but they stop using them after realizing that video-phones eliminate the convenience of being able to communicate without actually partaking in all the other kinds of communication. A silly example: with a video phone, you must pay attention to the person on camera; you can't talk to someone while clipping your nails.
Also in the book, people started using more technology to cover for these problems, from software that would make you look "presentable" to a complete avatar that would simulate all the motions that one should do while on camera.
In the end, people just realize that no one is actually using the video part of the phone anymore, so they just stop using it and go back to audio-only.
So, as much as I think we will have the technology to fool someone, as you mention, I don't think it will happen, simply because no one will adopt that technology in the first place.
As a user of video phones/laptops, what ends up happening is that you clip your nails anyway and keep on talking. Just like if a close friend were doing that while you were in the same room. It quickly becomes not a big deal.
I don't mean to pick on you, but what is so exciting about having a cam on to see (or not see?) your friends doing random things?
I see the value in having the occasional video conference, especially when talking with family or friends that are not geographically close. But I still see it as an attention cost. I would be slightly annoyed if someone that I'm video-chatting with decided to treat it as "not a big deal".
It's exactly a distance thing. Close friends and family that want to see you. It's like going over to a friend's house and hanging out. You don't have to pay attention all the time, but it's nice to see facial expressions and such. You don't clip your nails for the entire conversation either, and you can stop and go within the conversation. Eventually video calls can take as much attention as an audio-only call, and you can look at the person when you want to.
Funny. Just yesterday I had to use my girlfriend's laptop and (not being a Mac user) I didn't know that Skype on Mac OS X has "start video call" enabled by default.
I had a Skype call with a business associate, and it was quite bad when I realized that I was talking to him without wearing a shirt, and that the camera had started by itself. Now I can only hope that I was fast enough to cancel the video before it actually started streaming.
Perhaps you are right in the sense that people will get used to the idea that "always-on" video is normal. However, I will not. I am as far as possible from being technophobic; it's just that I like the barrier of not being on display.
I find it helpful to ask not "can this fool people" but something more like "can this fool people at 320x200 with an X kbps stream?" Same for "realistic" computer graphics. I haven't seen anyone push computer graphics that can fool me at "HD resolution", but pushing something "photorealistic" at low-grade web cam resolutions is perfectly doable. I bet the same is true here.
> I haven't seen anyone push computer graphics that can fool me at "HD resolution", but pushing something "photorealistic" at low-grade web cam resolutions is perfectly doable. I bet the same is true here.
That's why I use this metric. Instead of a boolean "is/is not photorealistic", this gives us a way to measure progress.
I've seen some radiosity-based architectural renderings that have me fooled at about 800x600. I haven't seen anything in motion that has me at that resolution yet, and even the architectural stuff still breaks down at 1024+ (the models are still too clean).
I think this is the first time in my life that I've felt like I was living in a sci-fi novel.