Vision Follower (RGB-Depth)

Depth-aware target tracking with integrated obstacle avoidance.

The VisionFollowerRGBD is a sophisticated 3D visual servoing controller. It combines 2D object detections with depth information to estimate the precise 3D position and velocity of a target.

Unlike the pure RGB variant, this controller uses a sampling-based planner (based on DWA) to compute motion. This allows the robot to follow a target while simultaneously navigating around obstacles, making it the ideal choice for “Follow Me” applications in complex environments.

How it Works

The controller utilizes a high-performance C++ core (VisionDWA) to execute the following pipeline:

  1. 3D Projection (Depth Fusion) - Projects 2D bounding boxes into 3D space using the depth image and camera intrinsics.

  2. DWA Sampling (Trajectory Rollout) - Generates candidate velocity trajectories based on the robot's current speed and acceleration limits.

  3. Collision Checking (Safety First) - Evaluates each trajectory against active sensor data (LaserScan/PointCloud) to ensure the robot doesn't hit obstacles while following.

  4. Goal Scoring (Relative Pose) - Selects the trajectory that best maintains the configured Target Distance and Target Orientation.
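
The projection in step 1 is a standard pinhole back-projection. Below is a minimal sketch of that math in Python, assuming a pinhole camera model with known intrinsics (fx, fy, cx, cy); it is illustrative only, not the VisionDWA C++ core, and the intrinsics values in the example are assumptions:

```python
import numpy as np

def bbox_center_to_3d(u, v, depth_raw, fx, fy, cx, cy, depth_factor=1e-3):
    """Back-project a bounding-box center pixel (u, v) into the camera frame.

    depth_raw is the raw depth value at (u, v); depth_factor converts it
    to meters (e.g., 1e-3 for a depth image encoded in millimeters).
    """
    z = depth_raw * depth_factor   # depth along the optical axis (m)
    x = (u - cx) * z / fx          # horizontal offset from the optical center (m)
    y = (v - cy) * z / fy          # vertical offset from the optical center (m)
    return np.array([x, y, z])

# Example with typical (assumed) intrinsics for a 640x480 depth camera:
point = bbox_center_to_3d(u=320, v=240, depth_raw=1500,
                          fx=525.0, fy=525.0, cx=319.5, cy=239.5)
```

The resulting 3D point (here roughly 1.5 m in front of the camera) is what the planner then tracks in steps 2-4.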

Key Features

  • Relative Positioning - Maintain a specific distance and bearing relative to the target (e.g., follow from a fixed distance at an offset angle).

  • Velocity Tracking - Estimates the target's velocity to provide smoother, more predictive following.

  • Recovery Behaviors - Includes configurable Wait and Search (rotating in place) logic for when the target is temporarily occluded or leaves the field of view.
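
The recovery behaviors can be pictured as a small state machine. The sketch below is a hypothetical simplification: the state names, the fixed wait period before searching, and the stop-after-timeout logic are assumptions for illustration, not the controller's actual API:

```python
from enum import Enum

class FollowerState(Enum):
    FOLLOWING = 0   # target visible, normal tracking
    WAITING = 1     # target briefly lost, hold position
    SEARCHING = 2   # rotate in place to re-acquire the target
    STOPPED = 3     # search timed out, give up

def next_state(target_visible, elapsed_lost, wait_timeout=2.0, search_timeout=30.0):
    """Pick the follower state from visibility and time (s) since target loss."""
    if target_visible:
        return FollowerState.FOLLOWING
    if elapsed_lost < wait_timeout:
        return FollowerState.WAITING
    if elapsed_lost < wait_timeout + search_timeout:
        return FollowerState.SEARCHING
    return FollowerState.STOPPED
```

Here search_timeout plays the role of the target_search_timeout parameter described below: once it elapses without a re-detection, the follower stops.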

Supported Inputs

This controller requires synchronized vision and spatial data.

  • Detections - 2D bounding boxes (Detections2D, Trackings2D).

  • Depth Image - Aligned depth image and camera info for 3D coordinate estimation.

  • Obstacle Data - LaserScan, PointCloud, or LocalMap for active avoidance.

Configuration Parameters

The RGB-D follower inherits all parameters from DWA and adds vision-specific settings.

| Name | Type | Default | Description |
|------|------|---------|-------------|
| target_distance | float | None | The desired distance (m) to maintain from the target. |
| target_orientation | float | 0.0 | The desired bearing angle (rad) relative to the target. |
| prediction_horizon | int | 10 | Number of future steps to project for collision checking. |
| target_search_timeout | float | 30.0 | Max time (s) to search for a lost target before giving up. |
| depth_conversion_factor | float | 1e-3 | Factor to convert raw depth values to meters (e.g., 0.001 for mm). |
| camera_position_to_robot | np.array | [0, 0, 0] | 3D translation vector (x, y, z) from camera to robot base. |
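
To illustrate how target_distance and target_orientation enter the goal-scoring step, here is a hedged sketch of the relative-pose error a candidate trajectory endpoint could be scored against; the follow_error helper is hypothetical, not part of the library:

```python
import numpy as np

def follow_error(target_xy, target_distance, target_orientation=0.0):
    """Distance and bearing error of the target relative to the robot.

    target_xy: estimated target position (m) in the robot frame.
    Returns (distance_error, bearing_error); a trajectory whose endpoint
    drives both errors toward zero maintains the configured follow pose.
    """
    distance = float(np.hypot(target_xy[0], target_xy[1]))
    bearing = float(np.arctan2(target_xy[1], target_xy[0]))
    return distance - target_distance, bearing - target_orientation

# Target 2 m straight ahead, configured follow distance 1.5 m:
# the robot still needs to close 0.5 m, with no bearing correction.
err_d, err_b = follow_error([2.0, 0.0], target_distance=1.5)
```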

Usage Example

from kompass.control import VisionRGBDFollowerConfig

config = VisionRGBDFollowerConfig(
    target_distance=1.5,     # follow 1.5 m from the target
    target_orientation=0.0,  # keep the target directly ahead
    enable_search=True,      # rotate in place to re-acquire a lost target
    max_linear_samples=15    # number of linear velocity samples for DWA
)

For full system integration, including ML model deployment, see the Vision-Depth Tracking Tutorial →