Interaction Points

Gesture Markup Language (GML) has been designed, in part, to serve as a configuration script for multi-modal gesture engines. One of the concepts used to create a consistent logical framework for analysis is the interaction point. An interaction point can be thought of as a strongly typed object with a well-defined position and direction (2D/3D vector) and a set of associated properties that define the detailed state of the interaction. Any input point cluster (multitouch points, skeletal hand points) or group of device input points (wearables or sensors) can be represented as a single interaction point and analyzed for gesture events. Interaction points can change state based on the pose of a hand or the mode of the input device. In the most general definition, a gesture occurs when an interaction point undergoes a characteristic change that can be repeatedly and reliably recognized by the gesture engine. It can therefore be helpful to think of a true multimodal and cross-modal gesture engine as a general-purpose interaction engine.
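
As a rough illustration, an interaction point can be modeled as a small typed structure, as in the Python sketch below. The class, field and enum names are assumptions made for this example only; they are not the GML schema or the GestureWorks Core API.

<code python>
# Illustrative sketch of an interaction point as a strongly typed object.
# Names and fields are assumptions for explanation, not the GestureWorks Core API.
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Tuple

class ModalType(Enum):
    TOUCH = "touch"    # multitouch points
    MOTION = "motion"  # skeletal hand or body points
    SENSOR = "sensor"  # wearables and device sensors

@dataclass
class InteractionPoint:
    point_id: int
    modal_type: ModalType
    position: Tuple[float, float, float]   # 2D inputs can leave z at 0.0
    direction: Tuple[float, float, float]  # unit vector (e.g. the palm normal)
    state: str = "default"                 # e.g. "palm", "index_point", "trigger"
    properties: Dict[str, float] = field(default_factory=dict)

# Example: a single tracked hand represented as one motion-type interaction point.
hand = InteractionPoint(
    point_id=1,
    modal_type=ModalType.MOTION,
    position=(0.12, 0.30, 0.45),
    direction=(0.0, -1.0, 0.0),
    properties={"confidence": 0.93, "openness": 0.1},
)
</code>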

To learn more about the internals of the Multitouch GestureWorks Core Interaction Engine see: GestureWorks Core Wiki.
To learn more about the internals of the Multimodal GestureWorks Core Interaction Engine see: GestureWorks Core Multi-Modal.

Interaction Point Management

There are many types of interaction points, but they all fall into three modal categories: touch points, motion points, and sensor points. These can be generated by hundreds of different device types and tracked objects. In the following explanation we will focus on interaction points created from 3D hand motion tracking.

3D Hand Pose & Motion Tracking

High-fidelity depth tracking devices can build a rich dynamic model of a user's hands. It has become standard practice to construct hand models using at least 22 reference points that approximate the joints and bones of the hand. Most APIs use a single point for the palm of the hand, with a 3D position and normal vector, along with five knuckles, five fingertips, two joints per finger, a single joint for the thumb, and point locations for the wrist and arm. This allows the “tracked” hand model to present many different hand orientations, finger extensions, and relative positions, creating hundreds of distinct hand poses. Each hand pose can be characterized as a single interaction point state with unique real-time properties that can be passed through a motion analysis pipeline for gesture recognition.
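
The layout described above can be made concrete with a short sketch. The point names below are assumptions chosen for readability rather than the naming of any particular tracking API; the count simply confirms that the listed features add up to 22 reference points.

<code python>
# One plausible 22-point hand model layout; point names are illustrative only.
FINGERS = ["thumb", "index", "middle", "ring", "pinky"]

def hand_point_names():
    names = ["palm", "wrist", "arm"]            # palm centroid plus wrist and arm points
    names += [f"{f}_knuckle" for f in FINGERS]  # five knuckles
    names += [f"{f}_tip" for f in FINGERS]      # five fingertips
    for f in FINGERS[1:]:                       # two intermediate joints per finger
        names += [f"{f}_joint_1", f"{f}_joint_2"]
    names += ["thumb_joint_1"]                  # a single joint for the thumb
    return names

print(len(hand_point_names()))  # 22
</code>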

Each hand is characterized by a single, independent interaction point. When two hands are present, two interaction points are registered by the system and used to perform advanced cluster analytics for refined 3D manipulations. There are currently 8 basic hand pose types supported in GML (pinch, point, trigger, fist, flat, splay, thumb, hook), and within each pose type there are groups of poses. For example, the “point” type has 4 simple configurations (index_point, index_middle_point, index_middle_ring_point and index_middle_ring_pinky_point). There are over 30 simple poses per hand, giving over 900 different pose pairs (left and right) to be considered for bi-manual interactions. For each pose there are 4 different temporal types and 7 different motion types (with directional sub-groups [e.g. up/down]), for a total of over 24 possible temporal-spatial states. This leads to over 700 single-hand gestures and thousands of simple, bi-manual 3D hand motion gesture combinations that can be recognized (not including micro-gestures or multi-modal combinations).
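
The counts quoted above can be checked with a quick back-of-envelope calculation. The sketch below only reproduces the approximate numbers given in this paragraph (roughly 30 simple poses per hand, 4 temporal types and 7 motion types); it is not an enumeration of the actual GML pose set.

<code python>
# Rough check of the pose and gesture counts quoted above (illustrative only).
poses_per_hand = 30              # "over 30 simple poses per hand"
print(poses_per_hand ** 2)       # 900 bi-manual pose pairs (left x right)

temporal_types = 4
motion_types = 7                 # each with directional sub-groups (e.g. up/down)
states_per_pose = temporal_types * motion_types
print(states_per_pose)           # 28 temporal-spatial states ("over 24")

print(poses_per_hand * states_per_pose)  # 840 single-hand gestures ("over 700")
</code>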


Positioning & Migration

When working with hand motion tracking it is important to ensure that any hands that can be visibly tracked are treated as persistent interaction points in 3D space. As such, the default xyz location associated with any hand is the palm position. This location is commonly calculated as the tracked hand centroid and often includes vector data about the palm normal (the direction perpendicular to the palm surface). In the GML interaction framework, each tracked hand is first given the palm point position and vector, and is then passed state and property updates as new information about the hand pose is recognized. This provides a stable identification marker as well as a robust interaction presence in 3D space that can be used to model rich physics-based and refined gesture-based interactions in a variety of application spaces.
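
As a minimal sketch of how such a palm position and normal could be estimated from tracked points, the example below averages the wrist and two outer knuckles and takes a cross product for the normal. The choice of points and the numpy-based math are assumptions for illustration, not the method of any specific tracking API.

<code python>
# Illustrative palm centroid and palm normal estimate from tracked hand points.
import numpy as np

def palm_position_and_normal(wrist, index_knuckle, pinky_knuckle):
    """Return an approximate palm centroid and unit normal from three tracked points."""
    wrist = np.asarray(wrist, dtype=float)
    index_knuckle = np.asarray(index_knuckle, dtype=float)
    pinky_knuckle = np.asarray(pinky_knuckle, dtype=float)

    centroid = (wrist + index_knuckle + pinky_knuckle) / 3.0

    # The normal is perpendicular to the plane spanned by the three points.
    normal = np.cross(index_knuckle - wrist, pinky_knuckle - wrist)
    normal /= np.linalg.norm(normal)
    return centroid, normal

centroid, normal = palm_position_and_normal(
    wrist=(0.0, 0.0, 0.0),
    index_knuckle=(0.08, 0.02, 0.10),
    pinky_knuckle=(-0.02, 0.02, 0.09),
)
print(centroid, normal)
</code>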

A hand can only exist in a single classified pose at any given time, so as the configuration of the hand changes, the interaction point state must change accordingly. When the hand configuration (relative finger positions and extensions) matches a defined pose, that pose state is exclusively assigned to the hand. When this occurs, the default location of the interaction point associated with the palm must be updated to integrate the new pose-qualified interaction point properties. This often results in the migration (local movement) of the interaction point on the hand. For example, a trigger pose pulls the interaction point location from the palm to the fingertip and redirects the interaction point vector from the palm normal to a line along the primary extended finger.
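
A hedged sketch of this migration step is shown below. The rule for the trigger pose follows the example above (move the point to the index fingertip and aim the vector along the finger); the function and field names are assumptions and reuse the illustrative InteractionPoint structure from the earlier sketch.

<code python>
# Sketch of pose-driven interaction point migration (names are illustrative).
import numpy as np

def migrate_interaction_point(point, pose, hand_points):
    """Update an interaction point's position and direction when a new pose is classified."""
    point.state = pose
    if pose == "trigger":
        # Trigger pose: pull the point from the palm to the extended index fingertip
        # and aim the vector along the finger instead of along the palm normal.
        tip = np.asarray(hand_points["index_tip"], dtype=float)
        knuckle = np.asarray(hand_points["index_knuckle"], dtype=float)
        direction = tip - knuckle
        point.position = tuple(tip)
        point.direction = tuple(direction / np.linalg.norm(direction))
    else:
        # Default: the interaction point sits at the palm centroid, aimed along the palm normal.
        point.position = tuple(hand_points["palm_position"])
        point.direction = tuple(hand_points["palm_normal"])
    return point
</code>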

There are other examples of interaction point migration which can be seen as hand configurations change. For example: a fist pose to pointing pose.

Or: a fist pose to a flat pose.

When a hand transitions through a series of poses, the sequence is called an “interaction point chain.” This type of characteristic pose switching can be defined as a gesture sequence that can be recognized with a unique gesture event.
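
One simple way to picture chain recognition is as a subsequence match over the recent history of pose states, as in the sketch below; the chain definitions and gesture names are illustrative assumptions only.

<code python>
# Sketch: recognizing a gesture event from a chain of pose states (illustrative only).
GESTURE_CHAINS = {
    "grab_release": ["flat", "fist", "flat"],
    "point_and_fire": ["index_point", "trigger"],
}

def matches_chain(pose_history, chain):
    """True if 'chain' occurs in order (not necessarily contiguously) in the history."""
    it = iter(pose_history)
    return all(pose in it for pose in chain)

history = ["flat", "transition", "fist", "transition", "flat"]
for name, chain in GESTURE_CHAINS.items():
    if matches_chain(history, chain):
        print(f"gesture event: {name}")  # -> gesture event: grab_release
</code>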


Hand Pose Transition Pathways

As a hand changes from one recognizable pose to another it can follow a number of defined transition pathways. Transition pathways are allowable configuration changes that follow normal kinematic motion, common mechanical hand skeleton constraints as well as ergonomic limits.
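
One way to encode such pathways is as an adjacency map of allowed pose-to-pose changes, as in the sketch below. The specific pathways listed are assumptions chosen for illustration, not the actual GML transition table.

<code python>
# Illustrative transition-pathway table (allowed pose-to-pose changes).
TRANSITION_PATHWAYS = {
    "fist":        {"index_point", "flat", "thumb", "relaxed"},
    "flat":        {"fist", "splay", "pinch", "relaxed"},
    "index_point": {"trigger", "fist", "relaxed"},
    "trigger":     {"index_point", "fist"},
    "splay":       {"flat", "relaxed"},
}

def is_allowed_transition(previous_pose, next_pose):
    """Check a pose change against kinematic and ergonomic pathway constraints."""
    return next_pose in TRANSITION_PATHWAYS.get(previous_pose, set())

print(is_allowed_transition("fist", "flat"))      # True
print(is_allowed_transition("trigger", "splay"))  # False
</code>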

[image:hand pose state hierarchy]

In each case the pose context is used to refine the 3D position and 3D direction vector associated with the interaction. When using the 22 tracked points of the hand to establish a pose, certain features carry more contextual weight than others.

Primary Hand Context Indicators:
relative thumb extension
relative index finger extension
finger-tip proximity
relative finger-tip angles
hand openness
hand object

Secondary Hand Context Indicators:
finger extension
hand orientation
hand motion
hand sidedness
relative finger motion

Tertiary Hand Context Indicators:
hand splay
wrist flex
pose on other hand
previous pose
arm motion

Each indicator has a direct threshold value and weighted contribution, along with category weights and confidence values associated with each tracked feature. These are combined using a balanced formula* to create a pose probability value associated with the real-time hand configuration, which is then passed through a fast decision tree (or other learning algorithm). A tracked hand can then be given an exclusive pose state based on mutually exclusive configurations. During pose transitions, hand configurations can fall into an undefined state between poses. To avoid this, all undefined states that are not recognized as a “relaxed” pose can be automatically categorized as a “transition” state. This acts as a catch-all, but the frequency of this state can also be used to indicate pose stability.
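
A minimal sketch of the scoring step is shown below, assuming per-indicator match values, weights and confidence values. The balanced formula referenced above is not reproduced here, so the simple weighted average, the 0.8 acceptance threshold and the example indicator values are assumptions for illustration only.

<code python>
# Sketch of combining weighted indicator scores into a pose probability (illustrative).
def pose_probability(indicators):
    """indicators: {name: (match_value_0_to_1, weight, confidence_0_to_1)}."""
    weighted = sum(value * weight * confidence
                   for value, weight, confidence in indicators.values())
    total_weight = sum(weight for _, weight, _ in indicators.values())
    return weighted / total_weight if total_weight else 0.0

def classify(candidate_scores, accept=0.8):
    """Pick a single exclusive pose; otherwise fall back to 'relaxed' or 'transition'."""
    pose, score = max(candidate_scores.items(), key=lambda kv: kv[1])
    if score >= accept:
        return pose
    if candidate_scores.get("relaxed", 0.0) >= 0.5:
        return "relaxed"
    return "transition"  # catch-all state between recognized poses

index_point_indicators = {
    "relative index finger extension": (0.95, 3.0, 0.9),  # primary indicator
    "relative thumb extension":        (0.90, 3.0, 0.9),
    "hand openness":                   (0.85, 2.0, 0.8),
    "previous pose":                   (1.00, 1.0, 1.0),  # tertiary indicator
}
scores = {"index_point": pose_probability(index_point_indicators),
          "trigger": 0.42, "fist": 0.10, "relaxed": 0.20}
print(classify(scores))  # index_point
</code>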

[image:simple hand pose transition pathways]

Pose Transition Pathways to Micro-Gestures

Once classified, interaction points update their properties and enter a well-defined context state. When certain context states are reached, they can be used to further refine a subset of features used to search for qualified micro-motions and associated micro-gestures. For example, the micro-slide gesture can only be accessed from a well-defined “index_point” pose. This requires the configuration of the hand to be known to a high degree of confidence so that the pose is defined with high probability. To do this, the hand features of interest (the index finger and its joints) must not only be highly visible but must also present a clear configuration (index finger fully extended with the other fingers tucked into the palm). Only then will the micro-motion of the index fingertip (relative to the index knuckle point) be considered for micro-slide gesture analysis. These managed requirements guide the user into performing more stable actions and naturally support (normally unreliable) sub-feature analysis.
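
The gating idea can be sketched as a guard around the sub-feature analysis, as below. The confidence and slide thresholds, the function name and the coordinate values are assumptions used only to make the flow concrete.

<code python>
# Sketch: micro-slide analysis gated on a confidently classified "index_point" pose.
import numpy as np

def micro_slide(pose, pose_confidence, index_tip, index_knuckle, prev_offset,
                min_confidence=0.9, slide_threshold=0.01):
    """Return (event, new_offset); the event is None unless a micro-slide is detected."""
    if pose != "index_point" or pose_confidence < min_confidence:
        return None, prev_offset  # sub-feature analysis is not attempted
    # Measure fingertip motion relative to the index knuckle so that global
    # hand motion does not register as a micro-slide.
    offset = np.asarray(index_tip, dtype=float) - np.asarray(index_knuckle, dtype=float)
    if prev_offset is None:
        return None, offset       # first stable frame establishes a baseline
    delta = float(np.linalg.norm(offset - prev_offset))
    event = ("micro_slide", delta) if delta > slide_threshold else None
    return event, offset

# Frame 1 sets a baseline; frame 2 reports the fingertip's relative motion.
_, baseline = micro_slide("index_point", 0.95, (0.10, 0.02, 0.30), (0.06, 0.00, 0.22), None)
event, _ = micro_slide("index_point", 0.95, (0.12, 0.02, 0.30), (0.06, 0.00, 0.22), baseline)
print(event)  # ('micro_slide', ~0.02)
</code>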

To learn more about micro-gestures see: Microgesture Index


Advanced Pose & Gesture Mapping
