Types of Input Fusion

There are three main categories, or levels, of fusion to consider in the field of HCI: Data Fusion, Feature Fusion and Context Fusion. Data fusion integrates raw input data of one form or another into a unified map, or combines the raw data of several sensors into what is effectively a single sensor unit, and should be considered “low level” as it occurs pre-processing. Feature fusion integrates qualified features of one form or another into a single object description and can be considered a “higher level” integration that occurs post-processing. Context fusion occurs when different qualified features, poses or gestures are grouped into a unified outcome or event; this level of fusion also occurs post-processing.

Data Fusion

Data fusion allows raw data to be directly integrated into a single data set before any features such as object meshes or skeletons have been characterized. Direct data fusion is fundamentally different from the feature fusion methods employed in GML and DML.

Point Cloud (Sensor) Fusion

One type of low-level fusion often associated with motion tracking, as seen in the “Kinect Fusion” framework, uses “point cloud” raw data sensor fusion techniques that rely on direct depth map integration. In “point cloud” sensor fusion the raw data from each depth map is carefully combined into a single integrated data set, which is then used to extract mesh data and define skeletal feature points. This direct sensor data fusion is fundamentally different from the feature fusion methods employed in GML and DML.
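
As a rough illustration only, the sketch below back-projects two or more depth maps into 3D points and merges them into a single point cloud, assuming each sensor's intrinsics (fx, fy, cx, cy) and 4x4 world pose are already known; the function names and calibration inputs are hypothetical and are not part of the Kinect Fusion API.

  import numpy as np

  def depth_to_points(depth, fx, fy, cx, cy):
      """Back-project a depth map (in metres) into camera-space 3D points."""
      h, w = depth.shape
      u, v = np.meshgrid(np.arange(w), np.arange(h))
      x = (u - cx) * depth / fx
      y = (v - cy) * depth / fy
      pts = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
      return pts[pts[:, 2] > 0]            # drop invalid (zero-depth) pixels

  def fuse_point_clouds(clouds, poses):
      """Transform each cloud by its sensor's 4x4 world pose and merge them."""
      fused = []
      for pts, pose in zip(clouds, poses):
          homo = np.hstack([pts, np.ones((len(pts), 1))])   # homogeneous coords
          fused.append((homo @ pose.T)[:, :3])              # into a shared world frame
      return np.vstack(fused)                               # one integrated data set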

IMU (Sensor) Fusion

Other types of sensor fusion exist that use low level data from sensors such as 3-axis accelerometers and gyroscopes. When data from a coupled accelerometer and gyroscope are intelligently “fused”, the resulting cross-qualification of raw data creates an inertial measurement unit (IMU) which provides inertial data with an implicit orientation estimate (6DOF from the accelerometer and gyroscope alone, or full 9DOF when a magnetometer is also fused). This type of direct sensor data fusion is fundamentally different from the feature fusion methods employed in GML and DML.
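
A minimal sketch of this kind of cross-qualification is a complementary filter, shown below, which blends integrated gyroscope rates with the orientation implied by gravity in the accelerometer reading; the axis conventions and the alpha blending constant are illustrative assumptions rather than a specific device's fusion algorithm.

  import math

  def complementary_filter(pitch, roll, accel, gyro, dt, alpha=0.98):
      """One fusion step: gyro rates (rad/s) integrate smoothly but drift,
      accelerometer readings (in g) are noisy but drift-free, so each
      cross-qualifies the other."""
      ax, ay, az = accel
      gx, gy, _ = gyro                     # roll rate about x, pitch rate about y

      # Orientation implied by gravity alone (no yaw is observable this way)
      accel_pitch = math.atan2(-ax, math.sqrt(ay * ay + az * az))
      accel_roll = math.atan2(ay, az)

      # Blend the integrated gyro estimate with the accelerometer reference
      pitch = alpha * (pitch + gy * dt) + (1 - alpha) * accel_pitch
      roll = alpha * (roll + gx * dt) + (1 - alpha) * accel_roll
      return pitch, roll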

Feature Fusion

Feature fusion enables the direct association of mid to high level feature data. Features can be fused from within the same input mode or between two distinct modalities. Fusing features within the same mode is called “Inter-modal Feature Fusion” and can be used to integrate features from multiple identical devices.

Cross-modal (Feature) Fusion

When using cross-modal fusion, feature data is combined from different modal input sources. For example, complementary orientation features tracked by wearable IMUs can be combined with skeletal features tracked by an optical 3D input device, such as a Nod ring paired with a Leap Motion device.
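
The sketch below illustrates one way such a pairing could be assembled, assuming hypothetical per-frame samples from each device that carry timestamps: it takes position from the optical tracker and orientation from the IMU whenever the two samples are close enough in time. None of the field names reflect an actual Nod or Leap Motion API.

  from dataclasses import dataclass

  @dataclass
  class HandFeature:
      """Fused hand description built from two input modalities."""
      position: tuple       # palm position from the optical tracker
      orientation: tuple    # quaternion from the worn IMU
      confidence: float

  def fuse_cross_modal(optical_frame, imu_frame, max_skew=0.02):
      """Pair the latest optical and IMU samples if their timestamps agree.
      Hypothetical frames: {'t': secs, 'palm': (x, y, z), 'conf': 0..1}
      and {'t': secs, 'quat': (w, x, y, z)}."""
      if abs(optical_frame['t'] - imu_frame['t']) > max_skew:
          return None       # samples too far apart in time to fuse safely
      return HandFeature(position=optical_frame['palm'],
                         orientation=imu_frame['quat'],
                         confidence=optical_frame['conf'])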

Inter-modal (Feature) Fusion

Feature data can be collected from multiple similar (or identical) devices. When these features are integrated into a single feature set it is called “Multi-device Fusion” or “Multi-instance Fusion”. When applied to 3D motion sensors, multiple identical and independent devices can be placed strategically to track object features (skeletons) and intelligently avoid optical interference through device coordination. More information about Multi-device Feature Fusion of 3D motion sensors can be found at http://www.deviceml.org/doku.php
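
As an illustration only (not the DML method itself), the sketch below merges the joint positions reported by several identical sensors using a confidence-weighted average, assuming the devices have already been calibrated into one shared world frame; the data layout is a hypothetical placeholder.

  import numpy as np

  def fuse_multi_device(skeletons):
      """Merge the joint set reported by several identical sensors.
      `skeletons`: list of dicts mapping joint name -> ((x, y, z), confidence),
      with all positions already calibrated into one shared world frame."""
      fused = {}
      joints = set().union(*(s.keys() for s in skeletons))
      for joint in joints:
          points, weights = [], []
          for s in skeletons:
              if joint in s:
                  pos, conf = s[joint]
                  if conf > 0:
                      points.append(np.asarray(pos, dtype=float))
                      weights.append(conf)
          if not points:
              continue
          w = np.array(weights)[:, None]
          # Confidence-weighted average: occluded or noisy devices contribute less
          fused[joint] = (np.stack(points) * w).sum(axis=0) / w.sum()
      return fused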

Context Fusion

Context fusion enables the direct association of high level context qualified features or gestures.

State (Context) Fusion

A great application of context fusion is the direct mapping of multiple different hand poses to a single pose group which can then trigger a single gesture event. For example, the act of grabbing an object can be performed with multiple different hand poses: an index finger and thumb pinch, a middle finger and thumb pinch, a middle-index and thumb pinch, a five finger pinch or a full hand-fist grab. Each can map to the same object action, which is considered a “grab”.

Technically, each would present a unique skeletal state and each would qualify as a unique skeletal pose, but the pose states would all map to the same grab gesture state and trigger a single interaction event. This fusion approach is particularly powerful as it allows layers of pose context to be strategically combined into a single gesture context and allows any pose conflicts or pose confusion to be managed directly. This type of redundant mapping avoids gesture ambiguity by connecting poses that appear similar from different orientations or that can be confused by tracking systems when hands are in fast motion. Gesture ambiguity is also reduced when a user forms an irregular pose, as the pose network has a greater chance of associating the hand pose with the “proper” pose. The overall effect creates a robust set of tracked fail-safe states that provide elegant degradation pathways to predictable gesture outcomes, which can be used (in conjunction with other user context cues) to preserve user intent.
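
A minimal sketch of this pose-group mapping is shown below; the pose labels and group names are hypothetical placeholders, not GML identifiers.

  # Any of these tracked pose labels counts as the same "grab" context
  GRAB_POSE_GROUP = {
      "pinch_index_thumb",
      "pinch_middle_thumb",
      "pinch_middle_index_thumb",
      "pinch_five_finger",
      "fist_grab",
  }
  POSE_GROUPS = {"grab": GRAB_POSE_GROUP}

  def resolve_gesture(pose_label, pose_groups=POSE_GROUPS):
      """Map a qualified hand pose to its fused gesture context, if any."""
      for gesture, poses in pose_groups.items():
          if pose_label in poses:
              return gesture            # many pose states -> one gesture state
      return None

  # Every pose in the group triggers the same "grab" interaction event
  assert resolve_gesture("pinch_middle_thumb") == "grab"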

Gesture (Context) Fusion

Independent gesture-driven events can be directly associated using context fusion. This explicit association occurs after each stage of tracking, feature characterization, qualification and gesture analysis has been processed. A good example of gesture context fusion is a multitouch gesture sequence such as “Hold + Tap”. These two gesture events can be processed and directly associated so that a single “hold_tap” event is dispatched when a “hold” event followed by a “tap” event is registered; one can act as a context qualifier for the other. This can, of course, be extended to include other modal or inter-modal gesture events and sequences.
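
As a rough sketch, the class below fuses the two independent events into a single “hold_tap” dispatch when a tap arrives while a recent hold is still active; the event names, callback and time window are illustrative assumptions rather than a real GML event model.

  import time

  class HoldTapFuser:
      """Fuse independent 'hold' and 'tap' gesture events into a single
      'hold_tap' event when the tap arrives while the hold is still active."""

      def __init__(self, dispatch, window=1.0):
          self.dispatch = dispatch      # callback that receives fused events
          self.window = window          # seconds the hold remains a valid qualifier
          self.hold_started = None

      def on_gesture(self, name, t=None):
          t = time.monotonic() if t is None else t
          if name == "hold":
              self.hold_started = t                     # hold becomes the context qualifier
          elif name == "tap" and self.hold_started is not None:
              if t - self.hold_started <= self.window:
                  self.dispatch("hold_tap")             # one fused event for two gestures
              self.hold_started = None

  # Usage: fuser = HoldTapFuser(print); fuser.on_gesture("hold"); fuser.on_gesture("tap")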

Another example can be seen in bi-manual 3D motion gesture interaction. A right hand “pointing” gesture can be mapped to cursor controls and a left hand “pinch” can be mapped to a key command so that “point and select” tasks (such as copy/paste) can be performed. Using this method the left (non-dominant) hand can context-qualify the right (dominant) hand and significantly increase control bandwidth, in a similar way to the now classic keyboard-and-mouse input paradigm.
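
The sketch below shows one simple way such bi-manual context qualification could be expressed, with the dominant hand driving the cursor and the non-dominant hand toggling selection; the gesture labels and frame fields are hypothetical.

  def fuse_bimanual(right_hand, left_hand):
      """Combine per-hand gesture features into one interaction command.
      Hypothetical inputs: right_hand = {'gesture': 'point', 'tip': (x, y)},
      left_hand = {'gesture': 'pinch'} (or any other label)."""
      command = {"cursor": None, "select": False}
      if right_hand.get("gesture") == "point":
          command["cursor"] = right_hand["tip"]     # dominant hand steers the cursor
      if left_hand.get("gesture") == "pinch":
          command["select"] = True                  # non-dominant hand qualifies the action
      return command

  # e.g. fuse_bimanual({'gesture': 'point', 'tip': (0.4, 0.7)}, {'gesture': 'pinch'})
  #      -> {'cursor': (0.4, 0.7), 'select': True}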


