This recipe builds upon basic hand tracking to demonstrate advanced techniques using Torin Blankensmith’s MediaPipe TouchDesigner plugin. Learn to create complex gesture recognition systems, multi-hand interactions, and sophisticated visual controls.
Based on: Torin Blankensmith’s MediaPipe TouchDesigner series. Plugin: github.com/torinmb/mediapipe-touchdesigner
Overview
Advance your hand tracking skills with:
- Multi-hand tracking and interaction detection
- Complex gesture recognition (swipes, pinches, rotations)
- Hand-based particle and physics systems
- Performance optimization for multiple hands
- Integration with 3D objects and particle systems
1. Setup & Plugin Configuration
1.1 Install the Plugin
- Download the latest release from github.com/torinmb/mediapipe-touchdesigner/releases
- Extract the `release.zip` file
- Place the `toxes/` folder alongside your TouchDesigner project file
- In TouchDesigner, press `Tab` and add a `MediaPipe` COMP
- Enable “External .tox” and load `MediaPipe.tox` from the toxes folder
- Add `Hand Tracking.tox` to the same network
1.2 Configure for Multiple Hands
- Select the MediaPipe COMP
- In the parameters, enable Hand Tracking in the model list
Set Max Num Hands to `2` (or higher if needed)
- Disable other models (Face, Pose, Object, etc.) to save performance
- Optionally enable Preview Overlay to see landmarks on the video feed
- Set your webcam as the video source
2. Understanding Multi-Hand Output
When tracking multiple hands, the plugin prefixes data with hand identifiers:
2.1 Hand Identification
Hand 1: `H1_<joint>_<axis>`
- Hand 2: `H2_<joint>_<axis>`
- And so on for additional hands
2.2 Available Data Per Hand
- 21 landmark points per hand (x, y, z coordinates)
- Gesture confidence values (open palm, pointing, thumbs up/down, etc.)
Helper channels:
  - `H1_pinch_midpoint_xyz` (midpoint between thumb and index finger)
  - `H1_pinch_distance` (0-1, where 0.05 = tight pinch, 0.25 = open hand)
  - `H1_spread` (hand openness, 0 = fist, 1 = fully spread)
  - `H1_<gesture>_confidence` for 7 built-in gestures
2.3 Additional Tracking Data
- Hand presence confidence (indicates if each hand is detected)
- Handedness classification (left vs right hand)
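Because every channel carries its hand prefix, per-hand data can be regrouped with a few lines of Python. This is a plain-Python sketch (a dict stands in for the CHOP channels; the sample names and values are illustrative, not the plugin's full channel list):

```python
# Sketch: group flat MediaPipe channel names by hand prefix (H1_, H2_, ...).
# In TouchDesigner you would read these from the Hand Tracking CHOP; here a
# plain dict stands in for the channel data.

def group_by_hand(channels):
    """Split {'H1_wrist_x': 0.5, ...} into {'H1': {'wrist_x': 0.5, ...}, ...}."""
    hands = {}
    for name, value in channels.items():
        prefix, _, rest = name.partition('_')
        if prefix.startswith('H') and prefix[1:].isdigit():
            hands.setdefault(prefix, {})[rest] = value
    return hands

sample = {
    'H1_wrist_x': 0.42, 'H1_pinch_distance': 0.07,
    'H2_wrist_x': 0.61, 'H2_pinch_distance': 0.22,
}
print(group_by_hand(sample)['H1']['pinch_distance'])  # 0.07
```

Grouping once and reusing the result also matches the "cache calculations" advice in the performance sections below.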
3. Multi-Hand Interaction Systems
3.1 Distance-Based Interactions
Detect when hands come close together or move apart:
1. Extract Hand Positions:
   - Add two `Select CHOP`s: one for `H1_palm_center_xyz`, one for `H2_palm_center_xyz`
   - (Calculate palm center by averaging wrist and MCP joints, or use available helper channels)
2. Calculate Distance:
   - Use a `Math CHOP` to compute Euclidean distance between the two position vectors
   - Formula: sqrt((x2-x1)² + (y2-y1)² + (z2-z1)²)
3. Map Distance to Visuals:
   - Use distance to control:
     - Size of a connecting visual element (line, beam, etc.)
     - Color hue (close = warm, far = cool)
     - Particle emission rate between hands
     - Audio frequency or filter cutoff
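The distance formula the Math CHOP computes can be sketched in plain Python (the sample coordinates are illustrative):

```python
import math

# Sketch of the Euclidean distance a Math CHOP computes between the two
# palm-center vectors (H1_palm_center_xyz and H2_palm_center_xyz).

def hand_distance(p1, p2):
    """Euclidean distance between two (x, y, z) hand positions."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p1, p2)))

# Example: palms 0.5 units apart along the x axis
print(hand_distance((0.0, 0.5, 0.0), (0.5, 0.5, 0.0)))  # 0.5
```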
3.2 Gesture-Based Interactions
Recognize specific multi-hand gestures:
Clap Detection
- Extract `H1_palm_center` and `H2_palm_center`
- Calculate distance between palms
- When distance < threshold AND both hands have low velocity → clap detected
- Add a `Lag CHOP` and `Trigger CHOP` for a clean pulse output
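The clap condition above reduces to a boolean check; here is a minimal sketch (the threshold values are illustrative and should be tuned against your own tracking data):

```python
# Sketch of the clap condition: palms close together AND both hands slow.

DIST_THRESHOLD = 0.12   # palm-to-palm distance below which hands "touch"
SPEED_THRESHOLD = 0.05  # per-frame speed below which a hand counts as "still"

def is_clap(palm_distance, speed_h1, speed_h2):
    return (palm_distance < DIST_THRESHOLD
            and speed_h1 < SPEED_THRESHOLD
            and speed_h2 < SPEED_THRESHOLD)

print(is_clap(0.08, 0.01, 0.02))  # True: close and still
print(is_clap(0.08, 0.30, 0.02))  # False: one hand still moving fast
```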
Prayer Hands Detection
- Check if both hands have a high `H1_palm_center_y` / `H2_palm_center_y` (hands raised)
- Check if distance between palms is small
- Check if both hands have low spread values (fists or partial fists)
- Combine conditions with Logic CHOPs
Push/Pull Gestures
- Track distance change over time:
- Subtract previous frame distance from current distance
- Positive value = pushing apart, negative value = pulling together
- Use this delta to control:
- Force strength in a particle system
- Scale of a 3D object
- Feedback amount in a visual effect
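The push/pull delta above is a frame-to-frame subtraction with optional smoothing; a small sketch (the smoothing factor is an illustrative exponential filter, not a plugin feature):

```python
# Sketch of the push/pull delta: distance change per frame, optionally
# smoothed. Positive delta = pushing apart, negative = pulling together.

class PushPullDetector:
    def __init__(self, smoothing=0.5):
        self.prev = None
        self.delta = 0.0
        self.smoothing = smoothing  # 0 = raw, closer to 1 = heavier smoothing

    def update(self, distance):
        if self.prev is not None:
            raw = distance - self.prev
            self.delta = self.smoothing * self.delta + (1 - self.smoothing) * raw
        self.prev = distance
        return self.delta

det = PushPullDetector(smoothing=0.0)
for d in (0.20, 0.25, 0.30):   # hands moving apart
    delta = det.update(d)
print(delta > 0)  # True: pushing apart
```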
4. Advanced Gesture Recognition
4.1 Swipe Detection
Detect horizontal or vertical swipes with one or both hands:
1. Velocity Calculation:
   - Track position over time to calculate velocity vectors
   - Use a `Math CHOP` with derivatives or manual frame-delay subtraction
2. Direction Detection:
   - Compare velocity components:
     - |vx| > |vy| and vx > threshold → right swipe
     - |vx| > |vy| and vx < -threshold → left swipe
     - |vy| > |vx| and vy > threshold → up swipe
     - |vy| > |vx| and vy < -threshold → down swipe
3. State Management:
   - Use a `Delay CHOP` and `Logic CHOP`s to create a swipe state machine
   - Require the hand to stay above the velocity threshold for a minimum duration
   - Reset the state when velocity drops below the threshold
4.2 Rotation Detection
Detect circular hand movements:
1. Angular Velocity Calculation:
   - Track position history to calculate angle change over time
   - Use multiple `Delay CHOP`s to get positions at t, t-1, and t-2
   - Calculate the angle between the vectors (p(t) - p(t-1)) and (p(t-1) - p(t-2))
2. Direction Detection:
   - Positive angular velocity = clockwise rotation
   - Negative angular velocity = counter-clockwise rotation
3. Applications:
   - Control hue rotation in a color wheel
   - Drive parameter knobs in audio synthesizers
   - Rotate 3D objects in the scene
   - Control scroll position in long lists
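The angular-velocity calculation can be sketched as follows. Note the sign convention depends on your coordinate system (screen coordinates with y pointing down flip it); this sketch uses standard math axes where positive means counter-clockwise:

```python
import math

# Sketch of the angular-velocity calculation: keep a short position history
# (as the Delay CHOPs would) and measure the signed angle between the two
# successive movement vectors.

def signed_angle(v1, v2):
    """Signed angle in radians from 2D vector v1 to v2 (positive = CCW)."""
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return math.atan2(cross, dot)

def angular_velocity(p_t, p_t1, p_t2):
    """Angle between (p_t - p_t1) and (p_t1 - p_t2)."""
    v_new = (p_t[0] - p_t1[0], p_t[1] - p_t1[1])
    v_old = (p_t1[0] - p_t2[0], p_t1[1] - p_t2[1])
    return signed_angle(v_old, v_new)

# Hand moving counter-clockwise around a circle gives a positive value here:
pts = [(math.cos(a), math.sin(a)) for a in (0.0, 0.3, 0.6)]
print(angular_velocity(pts[2], pts[1], pts[0]) > 0)  # True
```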
4.3 Pinch and Spread Tracking
Beyond basic pinch distance:
1. Pinch Velocity:
   - Calculate the rate of change of `H1_pinch_distance`
   - Fast closing pinch = grab/release gesture
   - Slow opening pinch = gradual parameter control
2. Multi-Finger Gestures:
   - Track distance between index and middle fingers
   - Track spread of all fingertips (variance of fingertip positions)
   - Detect “rock on” gesture (index + pinky extended)
   - Detect “spock” gesture (index-middle split, ring-pinky split)
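Both measures above are simple per-frame math; a sketch (frame rate and sample values are illustrative):

```python
# Sketch: rate of change of H1_pinch_distance (pinch velocity) and a simple
# fingertip-spread measure (variance of fingertip positions).

def pinch_velocity(prev_distance, curr_distance, fps=60):
    """Change in pinch distance per second; large negative = fast closing (grab)."""
    return (curr_distance - prev_distance) * fps

def fingertip_spread(tips):
    """Mean squared distance of fingertip (x, y) points from their centroid."""
    n = len(tips)
    cx = sum(p[0] for p in tips) / n
    cy = sum(p[1] for p in tips) / n
    return sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in tips) / n

print(pinch_velocity(0.20, 0.05) < 0)  # True: fast closing pinch → grab
```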
5. Hand-Driven Particle and Physics Systems
5.1 Particle Emitters from Hands
Create particle systems that emit from hand positions:
1. Basic Setup:
   - Use `H1_palm_center` or `H1_pinch_midpoint` as the emitter position
   - Connect to a `Source POP` → `Solver POP` chain
   - Render with a `Point Sprite MAT` or instanced geometry
2. Parameter Mapping:
   - Map `H1_pinch_distance` to particle emission rate (close pinch = more particles)
   - Map `H1_spread` to particle initial velocity (spread hand = faster particles)
   - Map hand velocity to particle initial direction/velocity
   - Map `H1_thumb_up_confidence` to enable/disable the emitter
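Each of the mappings above is a range remap, the same operation a Math CHOP's from/to range parameters perform. A sketch (the 0.05-0.25 pinch range and 0-500 particle count are illustrative):

```python
# Sketch of a range remap: map an input range to an output range, clamped.

def remap(value, in_lo, in_hi, out_lo, out_hi, clamp=True):
    t = (value - in_lo) / (in_hi - in_lo)
    if clamp:
        t = max(0.0, min(1.0, t))
    return out_lo + t * (out_hi - out_lo)

# Close pinch (0.05) = many particles, fully open (0.25+) = none:
print(remap(0.05, 0.05, 0.25, 500, 0))  # 500.0
print(remap(0.30, 0.05, 0.25, 500, 0))  # 0.0 (clamped beyond the open range)
```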
5.2 Hand-Controlled Forces
Use hands to affect particle systems or physics simulations:
1. Attractor/Repulsor:
   - Map hand position to the `Force POP` center
   - Map gesture confidence to force strength (thumbs up = attract, thumbs down = repel)
   - Use both hands for bipolar forces (one attracts, one repels)
2. Wind Control:
   - Map hand velocity/direction to `Wind POP` direction and strength
   - Use palm orientation (calculated from landmarks) for wind direction
   - Map hand spread to wind turbulence/noise
5.3 Soft Body Interaction
Create deformable objects that respond to hand touch:
1. Distance Field Approach:
   - Calculate the distance from the hand to each point in a soft-body SOP
   - Apply displacement inversely proportional to distance
   - Use an `Attribute Create SOP` to store and modify point positions
2. Direct Position Driving:
   - For low point-count SOPs, directly drive nearby points toward the hand position
   - Use distance-based falloff to prevent rigid movement
   - Apply smoothing/lag to prevent jitter
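The distance-field displacement can be sketched in plain Python. This is a minimal sketch, not the SOP implementation; radius and strength are illustrative, and here points are pushed away from the hand with a linear falloff:

```python
import math

# Sketch of distance-field displacement: points inside the radius are pushed
# away from the hand; closer points move more (linear falloff).

def displace_points(points, hand, radius=0.5, strength=0.1):
    """Push (x, y, z) points away from the hand position with falloff."""
    out = []
    for p in points:
        d = math.dist(p, hand)
        if 1e-9 < d < radius:
            falloff = 1.0 - d / radius          # 1 near the hand, 0 at the radius
            push = strength * falloff / d       # divide by d to normalize direction
            p = tuple(pi + (pi - hi) * push for pi, hi in zip(p, hand))
        out.append(p)
    return out

moved = displace_points([(0.1, 0.0, 0.0), (2.0, 0.0, 0.0)], hand=(0.0, 0.0, 0.0))
print(moved[0][0] > 0.1, moved[1][0] == 2.0)  # near point pushed, far point untouched
```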
6. Performance Optimization for M1 Pro
6.1 Model Configuration
- Only enable Hand Tracking - disable Face, Pose, Object models
- Set Max Num Hands to the actual maximum you need (2 is usually sufficient)
Consider lowering `model_complexity` to 0 for maximum performance (less accurate but faster)
6.2 Resolution Settings
- Hand Tracking works well at 640×480
- Can go down to 320×240 for static or slow-moving interactions
- Higher resolution (1280×720) only needed for fine finger tracking at distance
6.3 Data Processing Efficiency
- Only extract landmarks you actually need (use specific Select CHOPs)
- Calculate helper channels (like palm center) once and reuse
- Use `Null CHOP`s to avoid recalculating the same expressions
- Chain Math CHOPs efficiently (combine multiple operations when possible)
6.4 Visualization Optimization
- For hand visualization: use low-poly spheres (~16 segments) or instanced circles
- For trails: use Feedback TOP system rather than accumulating geometry
- For particle systems: optimize POP SOP cook resolution and particle count
- Consider rendering hand visualization to smaller TOP then compositing
6.5 Reducing Data Flow
- If only using one hand’s data, add a `Logic CHOP` to zero out the other hand’s channels when not present
- Use a `Switch CHOP` to select between hands based on confidence or other criteria
- Implement dead zones to ignore micro-movements when hands are supposed to be stationary
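A dead zone is a hold-until-moved filter; a sketch (the threshold is illustrative and should be tuned to your tracking noise):

```python
# Sketch of a dead zone: pass a value through only when it has moved past a
# threshold, so a "stationary" hand does not jitter downstream parameters.

class DeadZone:
    def __init__(self, threshold=0.01):
        self.threshold = threshold
        self.value = None

    def update(self, new_value):
        """Return the held value unless new_value moved past the threshold."""
        if self.value is None or abs(new_value - self.value) > self.threshold:
            self.value = new_value
        return self.value

dz = DeadZone(threshold=0.01)
print(dz.update(0.500))  # 0.5   first value passes through
print(dz.update(0.505))  # 0.5   micro-movement ignored
print(dz.update(0.530))  # 0.53  real movement passes through
```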
7. Practical Applications
7.1 Conducting System
Create a virtual conductor that controls music or visuals with hand movements:
- Left hand position/volume → audio gain or visual intensity
- Right hand position/tempo → beat detection or animation speed
- Conducting patterns (down-up-left-right) trigger different sections
- Gesture size controls dynamics (big gestures = forte, small = piano)
7.2 Two-Handed Sculpting
Create a virtual clay sculpting system:
- Left hand positions and orients the virtual workplane
- Right hand adds/subtracts material at intersection point
- Pinch distance controls tool size
- Hand rotation controls tool orientation
- Use feedback loops to accumulate sculpting changes over time
7.3 Interactive Particle Orchestra
Map different hand gestures to different sound-producing particle systems:
- Open palms → ambient pad particle system (slow, diffuse particles)
- Pointing fingers → lead synth particles (focused, directional)
- Clasped hands → bass particle system (large, slow-moving particles)
- Fingers spread → high-frequency particle system (fast, chaotic)
- Each system mapped to different audio frequencies or synth parameters
7.4 Collaborative Interaction
Enable multiple users to interact with the same system:
- Track hands from multiple users (if using multiple cameras or wide-angle lens)
- Assign colors or IDs to each user’s hands
- Create interaction zones where hands from different users can “collide” or “merge”
- Use hand proximity to trigger collaborative effects or shared visual elements
8. Parameter Reference
| Parameter | Location | Typical Range | Purpose |
|---|---|---|---|
| Max Num Hands | MediaPipe COMP | 1-4 | Number of hands to track simultaneously |
| Model Complexity | MediaPipe COMP | 0-1 | 0 = fastest (less accurate), 1 = default |
| Detection Confidence | MediaPipe COMP | 0.5-0.9 | Minimum confidence to detect hand |
| Tracking Confidence | MediaPipe COMP | 0.5-0.9 | Minimum confidence to keep tracking hand |
| Gesture Threshold | Expression CHOP | 0.6-0.8 | Confidence threshold for gesture triggers |
| Lag Time | Lag CHOP | 0.02-0.08s | Smoothing for jitter reduction |
| Velocity History | Math CHOP/Delay | 2-5 frames | How many frames to calculate velocity over |
| Interaction Distance | Math CHOP | 0.1-0.5 units | Distance threshold for hand interactions |
9. Performance Tips for M1 Pro
- Limit Hands: Only track as many hands as you actually need
- Lower Complexity: Set model_complexity to 0 unless you need high precision
- Smart Data Extraction: Only pull the 5-10 landmarks you use in your patch
- Cache Calculations: Calculate complex values (like palm center) once and reuse
- Efficient Math: Combine multiple operations in single Math CHOPs when possible
- Visual LOD: Use simpler representations when hands are far from camera or moving fast
- Monitor Frame Time: Use Dialogs → Performance Monitor; maintain <33ms per frame for 30fps
- Batch Similar Ops: Group similar mathematical operations to reduce CHOP count
10. Related Techniques
- Hand Tracking Tutorial — Basic hand tracking setup
- MediaPipe Face Tracking for Interactive Expressions — Facial expression tracking
- MediaPipe Pose Tracking for Full-Body Avatars — Full body tracking
- Hand-Tracked Chaotic Attractor — Chaos system from hand data
- Particle System with POPs — Advanced particle systems driven by hands
- GPU Fluid Simulation — Use hands to control fluid parameters
Parameter Tuning & Behavior
| Parameter | Behavior |
|---|---|
| Max Num Hands | 1-4; tracking more hands requires more CPU/GPU resources. |
| Model Complexity | 0 = fastest performance; 1 = highest accuracy for fine finger details. |
| Gesture Threshold | Higher = requires a very clear hand shape to trigger; Lower = more “lenient” detection. |
| Interaction Distance | Determines how close two hands need to be to trigger a “clap” or “merge” effect. |
Network Architecture
To visualize how the advanced multi-hand data flows, here is the final network map:
```
[ VIDEO INPUT ]              [ MEDIAPIPE PLUGIN ]
Webcam TOP ──────────────────▶ [ MediaPipe.tox ]
        │                      (Max Hands = 2)
        ▼
[ DATA DECODING ]            [ Hand Tracking.tox ]
        │
        ▼
[ CHANNEL SEPARATION ]       [ Select CHOP ] ─────────────────┐
  (H1_* and H2_*)                   │                         │
        ▼                           ▼
[ GESTURE LOGIC ]       [ Expression CHOP ]            [ Math CHOP ]
 (Claps / Swipes)       (Confidence Logic)           (Distance Calc)
        │                           │
        ▼                           ▼
[ CONTROL SIGNALS ]     [ Logic / Count CHOP ] ◀──────────────┘
        │
        ▼
[ VISUALS ]             [ POP Network ] ──▶ [ Geo COMP (Instancing) ]
 (Emit from Hands)        (Scale by Distance)
```
Data Flow Explanation
- Multi-Hand Sourcing: `MediaPipe.tox` is configured to track up to 2 hands. It outputs channels prefixed with `H1_` and `H2_`.
- Distance Calculation: We use a `Math CHOP` to compute the Euclidean distance between Hand 1 and Hand 2. This distance becomes a control signal for visual scale or audio frequency.
- Gesture Engine: The `Expression CHOP` monitors velocity and position to detect “swipes” (high velocity in one direction) or “claps” (proximity + low velocity).
- Feedback Control: Gesture triggers (like a clap) are sent to a `Count CHOP`, which cycles through different visual modes or resets a particle system.
- Particle Driving: `H1_pinch_midpoint` and `H2_pinch_midpoint` are used as emitter positions in a `POP Network`, allowing you to “spray” particles from your fingertips.