Micro-Gestures

A micro-gesture (microgesture) is a gesture built from a defined user action that relies on small variations in configuration, or micro-motions. In the most general sense, micro-gestures can be described as micro-interactions. Because micro-gestures use small variations in configuration and motion, gesture actions can be performed with significantly less energy than typical touch, motion or sensor-enabled gestures. This benefits users by allowing effortless interactions to occur in rapid succession, which directly supports greater fidelity and productivity within natural user interfaces.

In many cases, micro-gestures (and micro-interactions) require greater tracking precision than is commonly available in commodity HCI devices by default. As a result, one of the primary challenges in integrating micro-gestures into any interaction schema is creating smooth transitions between gesture and micro-gesture interactions. The interaction model (and multimodal interaction engine) used to define the full GML standard can provide reliable routes to micro-gesture interactions. This is done using a combination of available gesture analysis methods: managed context qualification and dynamic feature fusion.

[images: CamBoard micro-flick; Nod micro-tap gesture; Soli micro-slide gesture]

Micro-gestures are a key ingredient of any high-fidelity touch, motion or sensor-based user interface, enabling users to fully engage with digital content in a consistent and familiar manner. Well-managed micro-gesture interactions are critical to free-form natural user interfaces because they provide a level of predictable behavior that frees users' cognitive load to focus on content rather than on control compensation.

Hand Based Micro-Gestures

Micro-gestures have appeared in desktop touch screens, track-pads and, more recently, in short-range (radar) finger-tip tracking systems. A number of micro-interactions can be enabled that benefit from long-range 3D motion-tracking and the associated micro-gesture recognition of fingers, hands, arms, faces, surfaces and objects in ever-greater interaction volumes. (In this discussion we focus on extended-range microgestures associated with the hand and fingers within a user-centric interaction field that extends 1m around the user.)

Micro-gestures can only be accessed via well-defined pose transition pathways that match layered, cross-qualified context cues. This avoids accidental activation and serves as a strong delimiter for detailed feature analysis. For example, the “index_micro_slide” gesture can only begin (enter the gesture pipeline) once an index_point pose has been confidently established by persisting for a short dwell time with limited global hand (palm point) motion. This allows the relative motion of the finger-tip and index finger joints to be selectively tracked, and the relative velocity to be confidently calculated, without indirect interference from self-occlusion or motion blur (within the limits of device tracking precision). For the user, this gesture sub-space acts as an “interaction pocket” where minute actions can reliably control a variety of subtle interface elements. These high-fidelity interaction modes can be created in any comfortable location within the interaction field (the tracking field of the device), which in turn provides a low-energy, ergonomically friendly, user-defined, dynamic interaction space.
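A minimal sketch of this kind of context-qualification gate is shown below. The frame fields, confidence threshold and palm-travel limit are illustrative assumptions rather than part of the GML specification; the 300ms dwell simply mirrors the figure used later on this page.

<code python>
# Minimal sketch of a context-qualification gate for "index_micro_slide".
# Frame fields and thresholds are illustrative assumptions.
from dataclasses import dataclass
from typing import Sequence
import numpy as np

@dataclass
class HandFrame:
    timestamp: float            # seconds
    pose: str                   # e.g. "index_point"
    pose_confidence: float      # 0.0 .. 1.0
    palm_position: np.ndarray   # (x, y, z) in mm

def qualifies_for_micro_slide(frames: Sequence[HandFrame],
                              min_dwell: float = 0.3,         # 300 ms stable pose
                              min_confidence: float = 0.8,    # assumed threshold
                              max_palm_travel: float = 10.0   # mm, assumed threshold
                              ) -> bool:
    """Return True once an index_point pose has persisted with little global hand motion."""
    if not frames:
        return False
    dwell = frames[-1].timestamp - frames[0].timestamp
    if dwell < min_dwell:
        return False
    if any(f.pose != "index_point" or f.pose_confidence < min_confidence for f in frames):
        return False
    # Global (palm point) motion over the dwell window must stay small.
    palm_path = np.array([f.palm_position for f in frames])
    travel = np.linalg.norm(np.diff(palm_path, axis=0), axis=1).sum()
    return travel <= max_palm_travel
</code>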

As seen above, the micro-slide gesture uses the vector linking the index knuckle and the index tip position. This vector is directly compared with the index metacarpophalangeal (MP joint)-to-tip, proximal interphalangeal (PIP joint)-to-tip and distal interphalangeal (DIP joint)-to-tip vectors, along with the finger-tip normal. This gives a detailed measure of index finger extension as well as the relative motion of the finger-tip during the slide action.

[image: micro-slide, joint to tip vectors]

Assuming the hand is relatively motionless and a strong index point pose has persisted for more than 300ms, only a fully (>=90%) extended index finger will qualify for micro-slide analysis. If the relative motion of the tip (to the knuckle) is perpendicular to the plane of the palm, the micro-slide is considered vertical; if the motion is within the plane of the palm, it is considered a horizontal slide.
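The extension and direction tests can be sketched as follows. The 90% figure is taken from above, while the straight-line-over-segments extension proxy and the axis-ratio threshold are assumptions made for illustration.

<code python>
# Sketch of the extension check and slide-direction classification.
import numpy as np

def extension_ratio(mcp, pip, dip, tip):
    """Straight-line MCP->tip length over the summed segment lengths (1.0 = fully straight)."""
    segments = (np.linalg.norm(pip - mcp)
                + np.linalg.norm(dip - pip)
                + np.linalg.norm(tip - dip))
    return np.linalg.norm(tip - mcp) / segments

def classify_micro_slide(tip_velocity, palm_normal, extension, axis_ratio=0.7):
    """Label tip motion relative to the palm plane once the finger is sufficiently extended."""
    if extension < 0.90:
        return None                       # does not qualify for micro-slide analysis
    v = tip_velocity / (np.linalg.norm(tip_velocity) + 1e-9)
    n = palm_normal / (np.linalg.norm(palm_normal) + 1e-9)
    out_of_plane = abs(np.dot(v, n))      # 1.0 = motion perpendicular to the palm plane
    if out_of_plane > axis_ratio:
        return "vertical_micro_slide"
    if out_of_plane < 1.0 - axis_ratio:
        return "horizontal_micro_slide"
    return "ambiguous"
</code>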

[image: y-tap vectors]

The micro-slide gesture is very similar to the index-y-tap (also known as the key-tap) gesture. Both gestures require a stable and persistent index point pose, but only the relative motion of the finger tip is considered for micro-gesture analysis. If the tip undergoes a characteristic change in acceleration (a jolt), then a tap action can be confidently matched and separated from a micro-slide action.
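One hedged way to separate the two actions is to look for a sharp acceleration spike with little net travel; the thresholds and sampling assumptions below are illustrative only.

<code python>
# Sketch of separating a tap "jolt" from a slide, given a short buffer of
# finger-tip positions relative to the knuckle sampled at a fixed rate.
import numpy as np

def tap_or_slide(tip_positions, dt=1/60.0,
                 jolt_threshold=15000.0,   # mm/s^2, assumed
                 travel_threshold=25.0):   # mm, assumed
    """Classify a window of relative tip motion as a tap, a slide, or neither."""
    p = np.asarray(tip_positions, dtype=float)
    v = np.diff(p, axis=0) / dt
    a = np.diff(v, axis=0) / dt
    peak_accel = np.linalg.norm(a, axis=1).max() if len(a) else 0.0
    travel = np.linalg.norm(p[-1] - p[0])
    if peak_accel >= jolt_threshold and travel < travel_threshold:
        return "index_y_tap"     # sharp change in acceleration, little net travel
    if travel >= travel_threshold:
        return "micro_slide"     # sustained displacement within the palm frame
    return None
</code>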

Elegant Degradation & Position Stabilization

If, during the micro-gesture analysis process, any primary qualification is lost (e.g. the hand moves too fast or the finger becomes critically occluded), then micro-gesture analysis is skipped and the interaction point properties fall back to the default index pose properties or a pose transition state (whichever is most appropriate). For an index micro-slide interaction point, this means simplifying the finger-tip vector to derive from the palm point-to-tip rather than the knuckle-to-tip. If the index finger is lost entirely, the interaction point gracefully migrates toward the palm point location and stabilizes there to prevent any unpredictable interaction discontinuities.
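A small sketch of this fallback behavior, assuming a per-frame interaction-point update; the smoothing factor and feature flags are illustrative and not part of any device SDK.

<code python>
# Sketch of graceful degradation and position stabilization of the interaction point.
import numpy as np

def interaction_point(prev_point, palm, tip,
                      micro_qualified: bool, finger_tracked: bool,
                      smoothing: float = 0.3):
    """Choose the interaction-point source for this frame, then low-pass filter
    toward it so the point never jumps when tracking quality changes."""
    prev_point, palm, tip = (np.asarray(v, dtype=float) for v in (prev_point, palm, tip))
    if micro_qualified and finger_tracked:
        source, target = "knuckle_to_tip", tip   # full micro-slide derivation stays active
    elif finger_tracked:
        source, target = "palm_to_tip", tip      # simplified derivation / pose transition state
    else:
        source, target = "palm_point", palm      # finger lost: stabilize on the palm point
    return source, prev_point + smoothing * (target - prev_point)
</code>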

Single Hand Micro-Gestures

* tip_taps
* finger_snap (click)
* joint_taps
* tip_slide
* finger_slide
* finger_micro_trigger
* finger_micro_flick


Extended Bi-manual Hand Micro-Gestures

When in general-purpose hand-tracking modes, current commodity depth-mapping devices often find bi-manual hand-tracking challenging, as one hand may occlude the other during common use. Additionally, as one hand is placed in front of another, contour mapping becomes more complex and other skeleton extraction methods are required to isolate one hand pose from the other. To do this effectively, hand trackers must use near-field methods to separate objects of interest, along with context information about previously tracked hand states (whether hands existed before they merged).

Methods used to achieve this separation often rely on crafting and analyzing planes of depth space to build a basis for a skeletal model. When this is done well, flat hand surfaces can be used as reference planes to more reliably pick out finger-tip features and leverage the proximity of key features. This should be considered when defining bi-manual microgestures around the available, high-confidence tracked features. The following poses and actions use hand layering to define a natural set of cooperative bi-manual micro-gestures.

Bi-manual Micro-Tap Gestures

Bi-manual micro-taps use fingertip proximity and accelerated motion to identify the characteristic tap interaction on the surface of the hand, fingers and finger tips.

* tip_taps
* finger_taps
* posed_hand_taps
* wrist_arm_taps

Bi-manual Micro-Slide Gestures

Bi-manual micro-slide gestures leverage the derived hand surface even further by comparing the tracked motion of the finger-tip with the projected surface plane of the hand, looking for co-planar motion (a minimal sketch of this test follows the list below).

* finger_micro-slides
* palm_micro-slide
* arm_wrist_micro-slide
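A minimal sketch of the co-planar test, assuming a palm plane (point and normal) fitted from the static hand; the offset and travel tolerances are illustrative assumptions.

<code python>
# Sketch of a co-planar motion check for bi-manual micro-slides.
import numpy as np

def is_coplanar_slide(tip_positions, palm_point, palm_normal,
                      max_offset=15.0,     # mm above/below the palm surface, assumed
                      min_travel=20.0):    # mm of in-plane travel, assumed
    """True when a finger-tip path stays near the palm plane and slides along it."""
    n = np.asarray(palm_normal, dtype=float)
    n = n / np.linalg.norm(n)
    p = np.asarray(tip_positions, dtype=float)
    palm_point = np.asarray(palm_point, dtype=float)
    offsets = (p - palm_point) @ n                        # signed distance to the palm plane
    if np.abs(offsets).max() > max_offset:
        return False                                      # leaves the surface: not a slide
    in_plane = (p - palm_point) - np.outer(offsets, n)    # project the path onto the plane
    travel = np.linalg.norm(in_plane[-1] - in_plane[0])
    return travel >= min_travel
</code>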

As with the bi-manual micro-tap gestures, the micro-slide gesture set has a great advantage over other free-form 3D motion hand gestures and microgestures: it provides critical auto-haptic (tactile) feedback to the user as the surfaces (skin) of the hands touch. Auto-haptic feedback is an incredibly powerful way to introduce complementary force feedback to the user via their own hands (collinear motion). It does require excellent tracking registration to be most effective; otherwise perceived inaccuracy or latency can have a negative impact on user experience.


Fusion Augmented Hand Micro-Gestures

In most cases, finger and hand micro-gestures are too subtle to be detected by current off-the-shelf HCI depth-mapping devices without careful sub-feature management to compensate for jitter or occlusion. However, context-qualified high-fidelity data can be actively “fused” with wearables (or other depth-mapping devices) to create hybrid tracking systems capable of reliably discerning the hand features and motion associated with micro-gesture actions. For example, most depth-mapping-based hand-tracking devices are good at tracking the pose and position of hands but, due to motion blur and occlusion, they present unreliable orientation, velocity and acceleration properties. Hand wearables with integrated IMUs, such as rings and bracelets, can be worn without interfering with depth-mapping-based hand-tracking devices and provide supplementary high-precision orientation and acceleration data, which can be used to reinforce general hand feature and gesture confidence as well as to build rich micro-motion gesture interactions.
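The division of labor can be sketched as below: position and pose from the depth tracker, orientation and acceleration from the wearable IMU. The field names and blend weights are assumptions for illustration, not any particular SDK's API.

<code python>
# Sketch of camera/IMU feature fusion into a single hand state.
from dataclasses import dataclass
import numpy as np

@dataclass
class FusedHandState:
    position: np.ndarray       # mm, taken from the depth-mapping tracker
    pose: str                  # e.g. "index_point", taken from the depth-mapping tracker
    orientation: np.ndarray    # quaternion (w, x, y, z), taken from the wearable IMU
    acceleration: np.ndarray   # m/s^2, taken from the wearable IMU
    confidence: float          # blended confidence of the two sources

def fuse(camera_frame: dict, imu_sample: dict,
         camera_weight: float = 0.4, imu_weight: float = 0.6) -> FusedHandState:
    """Combine the complementary strengths of the two sources into one hand state."""
    confidence = (camera_weight * camera_frame["pose_confidence"]
                  + imu_weight * imu_sample["signal_quality"])
    return FusedHandState(
        position=np.asarray(camera_frame["palm_position"], dtype=float),
        pose=camera_frame["pose"],
        orientation=np.asarray(imu_sample["quaternion"], dtype=float),
        acceleration=np.asarray(imu_sample["linear_acceleration"], dtype=float),
        confidence=confidence,
    )
</code>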

Watch & Hand Pose Based Gestures

Wearable smart watches such as the "Basis Peak" can stream accelerometer data via Bluetooth to connected desktops or smartphones. When combined with depth-map-based hand-tracking devices using feature fusion, key properties can be reinforced in a complementary fashion: positional drift can be removed from the accelerometer data, and the refined orientation and acceleration values can be leveraged to produce a subset of watch-based micro-gestures. For example, a watch/bracelet (or arm/wrist) tap action can be recognized from both the hand pose and finger-tip intersection (on the wrist) and the subtle spike in accelerometer data caused by the finger tap. This yields a reliable tap gesture without the need for a touchscreen.

* watch_tap + pose (index, index-middle)
* watch_flick (fist, splay, flat, point, pinch, trigger)
* watch_tilt (fist, splay, flat, point, pinch, trigger)


The watch-tap micro-gesture can be reliably achieved using context-based fusion of two key features: first, the finger tip of the “point pose” must be close to the watch location; second, the fingertip must intersect with the watch surface while a “bump” or “jolt” is detected by the watch IMU. This allows the gesture to be built on both hand pose and sensor-based fine motion.
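A sketch of this two-part qualification follows; the distance and spike thresholds, and the source of the watch position, are assumptions made for illustration.

<code python>
# Sketch of the watch-tap qualification: fingertip proximity plus an IMU spike.
import numpy as np

def watch_tap_detected(tip_position, watch_position, imu_accel_window,
                       max_distance=20.0,      # mm between tip and watch surface, assumed
                       spike_threshold=3.0):   # g, assumed "bump"/"jolt" level
    """Require the pointing finger-tip at the watch AND an IMU spike in the same window."""
    near_watch = np.linalg.norm(np.asarray(tip_position, dtype=float)
                                - np.asarray(watch_position, dtype=float)) <= max_distance
    accel_mags = np.linalg.norm(np.asarray(imu_accel_window, dtype=float), axis=1)
    bump = accel_mags.max() >= spike_threshold
    return near_watch and bump
</code>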


A “flick” action is markedly different from a swipe or drag action in that it has a characteristic acceleration profile. A watch-flick micro-gesture can be achieved with minimal motion by establishing the pose of the hand and then using IMU data from the watch to precisely measure the 3D acceleration of the wrist.
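One hedged way to read the acceleration profile from the watch IMU (with the hand pose already qualified) is to require a single brief, high-magnitude burst; all thresholds below are illustrative.

<code python>
# Sketch of separating a watch-flick from a swipe/drag by its acceleration profile.
import numpy as np

def is_flick(accel_window, dt=0.01,
             burst_threshold=4.0,    # g, short sharp burst, assumed
             max_duration=0.15):     # s, the whole burst must fit in this window, assumed
    """A flick shows one brief, high-magnitude acceleration burst; a drag does not."""
    mags = np.linalg.norm(np.asarray(accel_window, dtype=float), axis=1)
    above = mags >= burst_threshold
    if not above.any():
        return False
    burst_duration = above.sum() * dt
    return burst_duration <= max_duration
</code>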


A common problem with computer-vision-based AR systems (beyond registration errors) is the tracking precision associated with the relative orientation of complex 3D objects within a scene. When wearable devices are added to a user-centric AR system, they can be used to improve hand- and wrist-tracking confidence. Once the IMU data from the wearable is integrated with a hand object (with a known location), 6DOF tilt data can be used to establish the relative plane of the wrist, which is then used as a projection platform to present overlay content.
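A sketch of deriving the wrist reference plane, assuming a tracked wrist position from the camera system and an orientation quaternion from the wearable IMU.

<code python>
# Sketch of building a wrist plane (point + normal) for overlay projection.
import numpy as np

def quat_rotate(q, v):
    """Rotate vector v by unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    u = np.array([x, y, z], dtype=float)
    v = np.asarray(v, dtype=float)
    return 2.0 * np.dot(u, v) * u + (w * w - np.dot(u, u)) * v + 2.0 * w * np.cross(u, v)

def wrist_plane(wrist_position, imu_quaternion, local_up=(0.0, 0.0, 1.0)):
    """Return (point, normal) of the plane used as the overlay projection platform."""
    normal = quat_rotate(imu_quaternion, local_up)
    return np.asarray(wrist_position, dtype=float), normal / np.linalg.norm(normal)
</code>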

[image: pose activated precision tilt to create a stable MR projection reference (holographic watch/arm)]

When using advanced head-mounted displays (HMDs) with stereo LCD or “digital light field” displays, dynamic “virtual” content can be integrated directly onto the “real” wrist image with precision transforms (and virtual intersection detection) to create highly responsive (hand-tethered) holographic interfaces that can be controlled with bare-hand micro-gestures.


Ring & Hand Pose Based Micro-Gestures

There is a new generation of wearable smart rings, such as the "Nod Ring", which connect to desktop and mobile devices via (low-energy) Bluetooth to stream accelerometer, gesture and button data. When placed on the index finger, the Nod ring can be used as a direct control device, but it can also be used in conjunction with depth-map or RGB-based hand-tracking devices to create a unique set of fusion-augmented micro-gestures. For example, when the Nod ring is used with a Leap Motion device, the ring does not interfere with hand tracking. Because the Leap Motion device does not track true pinch gestures (to the point where the finger tips touch), Nod button touch actions (involving manipulations of the index finger and thumb) can be performed without conflict. As a result, a large subset of micro-gestures can be directly enabled by fusing tracked features from the Nod Ring and Leap Motion devices.

The Nod ring can provide direct feature data for the finger it is worn on. Because the index finger is one of the strongest context indicators of the hand, a Nod ring worn on the index finger can be used to selectively create complementary micro-gesture sets for index_point_gestures, index_trigger_gestures, index_pinch_gestures and any pose indirectly involving the index finger, while still supporting and augmenting thousands of existing Leap Motion gesture combinations through motion stabilization and increased orientation precision.

* point_ring_tap (index-point, index-middle-point)
* splay_ring_tap
* splay_ring_slide
* splay_ring_press (buttonA, buttonB)

By fusing Leap Motion hand features and states with existing Nod button states such as pad slides, button taps and presses, as well as roll, pitch or orientation changes, Nod actions can be pose-qualified to provide a more nuanced set of gesture interactions. For example, a fist_ring_press and a splay_ring_press can be separated by pose type and mapped to entirely different gesture controls (see the sketch after the list below).

* ring_flat
* ring_point
* ring_trigger
* ring_pinch
* ring_fist
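A sketch of pose-qualifying ring button events along the lines described above; the event fields and mapping targets are assumptions for illustration.

<code python>
# Sketch of pose-qualified ring button actions: the same physical button event
# resolves to different micro-gestures depending on the tracked hand pose.
POSE_QUALIFIED_ACTIONS = {
    ("fist",  "press"): "fist_ring_press",    # e.g. mapped to "grab/confirm"
    ("splay", "press"): "splay_ring_press",   # e.g. mapped to "open radial menu"
    ("point", "tap"):   "point_ring_tap",
    ("splay", "slide"): "splay_ring_slide",
}

def qualify_ring_event(hand_pose: str, ring_event: str):
    """Return the pose-qualified micro-gesture, or None if the combination is not mapped."""
    return POSE_QUALIFIED_ACTIONS.get((hand_pose, ring_event))

# Example: an identical button press yields different gestures per pose.
assert qualify_ring_event("fist", "press") != qualify_ring_event("splay", "press")
</code>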

Nod actions such as tap, double-tap and slider press actions require a very small amount of time and energy to perform (comparable to a fast mouse click), which gives users access to motion-based micro-gestures that integrate directly with fast-twitch tactile controls. The high-precision orientation data provided by the Nod ring is not used by every gesture; however, once it is fused with hand pose data, it is easy to create restful (data streaming) states that conserve battery and processing, or simply to create “locked” interaction layers that only activate on very deliberate (but low-cost) button presses. For example, Nod-ring-activated tilt-based bi-manual manipulations can be created to allow finely tuned transformation of virtual objects, such as an axis-locked distortion or a stepped rotation of a view.

[image: Bi-manual Nod + Leap Motion virtual tool manipulation “hyper-precision mode dual-splay-press”]

These types of tool-based “hyper-precision modes” can operate in parallel with natural hand poses and yet still provide full bi-manual degrees of freedom without interference, jitter or occlusion discontinuities. This ensures reliable controls, and just as importantly, a critical measure of predictable hand/tool modal presence.

Microgesture Use-case Scenarios

There is a wealth of applications for 3D microgestures within the HCI field: from gear-shift and steering-wheel microgestures in the automotive industry, to compact media-player controls in the smart home, low-energy human-robot interaction controls, and rich AR/VR/MR menu and object controls.

For more information on micro-gesture use-case scenarios in VR, AR and MR see: VCML Microgesture Control Schemes
