Advances in sensors and computer-vision algorithms have introduced interesting new ways to interact with technology: body and gesture tracking for game consoles like the Kinect, FOVE’s eye tracking, detailed hand tracking like Leap Motion and Nimble VR. Most recently I was pretty blown away by Microsoft Research’s HandPose, which demonstrates low-latency, highly accurate, robust full hand and finger sensing from a single depth camera.
I think this is cool tech and hope folks keep researching it super hard. But I’ll tell you what: I don’t think hand and finger tracking will be the main way we interact with computers, especially not in Virtual or Augmented Reality.
On the one hand, much of our interaction with the world involves hands and fingers. Humans have highly evolved hands and fingers, wired with enormously complicated, nerve-dense, small-muscle-dense, fine motor control for manipulating tools. We also have a highly evolved sense of proprioception – the knowledge of the position of our body, and especially of our hands and fingers within arm’s reach, even in the dark or with our eyes closed, even when we can’t see our body or hands. Our binocular vision gives us our strongest sense of 3D and depth perception within reach of our arms. It’s absolutely the case that our hands will be involved in input in VR and AR. But the reality is that we manipulate our tools very, very subtly, with learned, tool-specific haptic feedback, very often out of our own sight using our proprioception. Our hands are rarely the tools themselves.
Forks, knives, spoons, bowls, dishes, stoves, faucets and sponges. Our hands manipulate these tools to prepare, eat, and clean up food. Learning to use most of these tools is pretty easy, and there are both fine- and gross-motor skills involved (professional chefs can accurately chop hundreds of 0.5mm slices of vegetables).
Pianos, cellos, flutes. Our hands (and sometimes our mouths) manipulate these tools to create music, and there is an exceptionally fine level of arm, hand, and finger control and lengthy training involved in becoming skilled at making music – there may be a difference of 0.5mm or less in finger position and a few hundredths of a Newton of force between the correct and incorrect note played on a flute.
Keyboards, mice, touchpads, game controllers. These tools capture a tremendous amount of information from very small motions of our fingers, hands, wrists, and forearms. A keyboard keypress may involve your fingertip traveling 1-2mm, and typing whole sentences rarely involves fingers moving more than a few cm in any direction. Mice and touchpads allow very small pressures and sub-millimeter motions to translate accurately to complicated 2D interfaces. Gamepads are designed to accurately register joystick and trigger motions of less than 0.1mm, radially or linearly, at 60Hz or higher, and skilled gamers can press multiple buttons with 1-2mm of actuation 30 times per second in brief bursts.
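To give a concrete sense of how software turns those tiny stick motions into usable input, here’s a minimal sketch of a radial dead zone – the common technique for separating deliberate sub-millimeter stick motion from sensor noise and drift. The threshold values are illustrative guesses, not any real controller’s firmware.

```python
import math

def apply_radial_deadzone(x, y, deadzone=0.08, maximum=1.0):
    """Map raw stick axes in [-1, 1] to a filtered 2D vector.

    Readings inside the dead zone (noise, stick drift) become (0, 0);
    readings outside are rescaled so the output magnitude still spans
    the full 0..1 range. The 0.08 threshold is a hypothetical value.
    """
    magnitude = math.hypot(x, y)
    if magnitude < deadzone:
        return (0.0, 0.0)
    # Rescale so motion just past the dead zone starts near zero,
    # preserving the fine gradations the hardware can register.
    scaled = min((magnitude - deadzone) / (maximum - deadzone), 1.0)
    return (x / magnitude * scaled, y / magnitude * scaled)
```

A radial (circular) dead zone is usually preferred over clipping each axis independently, since per-axis clipping distorts diagonal motion.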
Mobile phone touchscreens, and the touch-based interactions built on them, are fascinating tools combining the eye-hand coordination of mice with very fine horizontal and vertical finger motor control and direct-manipulation visual feedback. As a direct-manipulation tool, though, touchscreens rarely take advantage of proprioception.
I think there will be some uses in technology, and even in VR/AR, for gesture-based, non-haptic interaction of the type we saw with the Kinect and are now seeing, with more accuracy and better hand models, with Leap Motion and HandPose. But we are tool users. We hold our tools in our hands, or we rest our hands on them, or we lay our hands across them. And we move our tools around, sometimes out of sight behind us. We do this so that our tools amplify the millions of years of evolution that went into our brains, eyes, hands, fingers, and our sense of proprioception.
So far I’ve run across two tools that make sense in VR (and I’ve tried… all of them). The first is the game controller: the Xbox controller and the PlayStation DualShock 4 both work well to give you a well-understood, input-dense tool whose controls can be mapped easily to in-VR manipulations. I slightly prefer the DualShock 4 since it also provides accelerometer and gyroscope data, giving even more input. For gamers it’s a positive that game controllers have an established haptic and proprioceptive model (gamers know a controller’s feel and layout without seeing it), but there are several negatives in my mind: primarily that gamers expect their left thumb to control motion, but also that your hands feel stuck together by this tool.
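The accelerometer and gyroscope data matters because it can be fused into an orientation estimate for the controller. As a rough illustration – with synthetic numbers and a single axis, not the DualShock 4’s actual API or axis conventions – a complementary filter blends the gyro’s precise-but-drifting integrated angle with the drift-free gravity direction the accelerometer reports:

```python
import math

def complementary_filter(angle, gyro_rate, accel_x, accel_z, dt, alpha=0.98):
    """One step of a one-axis complementary filter (tilt, in radians).

    gyro_rate: angular velocity (rad/s) -- precise short-term, but drifts.
    accel_x, accel_z: gravity components -- noisy, but drift-free.
    alpha weights the gyro path; 0.98 is an illustrative choice.
    """
    gyro_angle = angle + gyro_rate * dt          # integrate the gyro
    accel_angle = math.atan2(accel_x, accel_z)   # tilt implied by gravity
    return alpha * gyro_angle + (1 - alpha) * accel_angle
```

Called once per sensor sample, this tracks fast motion through the gyro term while the small accelerometer term continually pulls the estimate back toward the true tilt, so drift never accumulates.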
The second tool is the Valve/HTC Vive lighthouse wands (I tried prototypes; the shipping devices are somewhat different). These controllers are, in a nutshell, unbelievably great: a highly accurate cross between a joystick and a touchpad beneath each thumb, a trigger beneath each pointer finger, and a gentle squeeze-actuated button on the barrel. The shape alone gives a very clear haptic sense. I have only used them a few times, so I don’t yet have a sense of how dense the input can be, but it definitely feels right in VR to have both hands free and in a neutral position, to be able to use the fine motor control of your wrists, thumbs, and fingers to manipulate the environment, and to do so from any position you choose (above your head, behind your back, etc.). When you get a chance to try these, I think you’ll agree they take VR to a different level.