Spot is an RWI B-14 robot with 3 cameras mounted on a Directed Perception pan/tilt mount. The outer (smaller) cameras are greyscale cameras used for stereo vision. The central camera is a color camera used to detect humans in a cluttered indoor environment. Spot was developed at the Texas Robotics and Automation Center.
    Spot represents core technology for a robo-waiter project. He uses a system of perceptual memory in the skill layer of the 3T architecture to remember and track the positions of multiple people, some of whom will not always be in view. His stereo system is based on correlation of texture data in virtual 3-D regions of space called proximity spaces. More details about Spot and his software architecture are available in:

Integrating a Behavior-based Approach to Active Stereo Vision with an Intelligent Control Architecture for Mobile Robots,
David Kortenkamp, Eric Huber and Glenn Wasson, to appear in Hybrid Information Processing in Adaptive Autonomous
Vehicles, ed. Gerhard K. Kraetzschmar and Gunther Palm, Springer-Verlag, 1998.

Integrating Active Perception with an Autonomous Robot Architecture, G. Wasson, D. Kortenkamp, E. Huber, Journal of
Robotics and Autonomous Systems, 1999, to appear.

A Behavior Based, Visual Architecture for Autonomous Robots, G. Wasson, E. Huber, D. Kortenkamp, CVPR 98
Workshop on Perception for Mobile Agents, 1998, 89-94.


The following sections contain movies of Spot and his stereo vision system in action. The first section demonstrates the proximity space system working "on the bench". The second section shows Spot using the system, and the final section shows Spot's perceptual memory integrated with the stereo vision data. All movies are in QuickTime format and file size is given in parentheses.

Proximity Space Movies

The proximity space system is implemented on a 200 MHz MMX Pentium running Windows NT 4.0. Two MuTech digitizers are used to capture greyscale images. The proximity spaces themselves represent virtual 3-D spheres that can be thought of as clinging to objects in space. The region of space within a given proximity sphere is mapped to 2-D regions of the left and right stereo images. The movies of the proximity space system show the left and right stereo images after being processed by a Laplacian of Gaussian (LOG) filter. The LOG filter exposes "visual texture" in the image, and the texture patterns within the proximity space are searched for in both images to establish the correspondence. A series of stereo and motion correlations are performed and the results are fed to a set of behaviors that control the motion of the proximity space. In these movies we see the proximity space tracking the object within it. The colored box in both images represents the proximity space. When the box is red, it has a target and is tracking. When the box is green, it has lost its target and begins searching for a new target. Proximity spaces do not have a model of what they are tracking and they do not store texture patterns over long periods. They form the basis of a visual system, so Spot must use "higher level" information to fully control their operation.
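
The filter-then-correlate idea can be illustrated with a small sketch. This is not Spot's code: the function names, the kernel size, and the use of a sum of absolute differences as the matching score are all assumptions made for illustration, and the real system's correlation measure may differ. The sketch filters two images with a sampled LOG kernel and then finds the horizontal shift (disparity) at which a window in the left image best matches the right image.

```python
import math
import random

def log_kernel(sigma, radius):
    """Sampled Laplacian-of-Gaussian kernel (unnormalized), shifted to sum to zero."""
    size = 2 * radius + 1
    kernel = [[0.0] * size for _ in range(size)]
    for j in range(size):
        for i in range(size):
            r2 = (i - radius) ** 2 + (j - radius) ** 2
            kernel[j][i] = (r2 - 2 * sigma ** 2) / sigma ** 4 * math.exp(-r2 / (2 * sigma ** 2))
    mean = sum(map(sum, kernel)) / size ** 2
    return [[v - mean for v in row] for row in kernel]

def apply_filter(image, kernel):
    """Filter the image interior; border pixels are left at 0."""
    r = len(kernel) // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(r, h - r):
        for x in range(r, w - r):
            out[y][x] = sum(kernel[j + r][i + r] * image[y + j][x + i]
                            for j in range(-r, r + 1) for i in range(-r, r + 1))
    return out

def best_disparity(left, right, x, y, half, max_disp):
    """Disparity d in [0, max_disp] that minimizes the sum of absolute differences
    between the left window centered at (x, y) and the right window at (x - d, y)."""
    def sad(d):
        return sum(abs(left[y + j][x + i] - right[y + j][x - d + i])
                   for j in range(-half, half + 1) for i in range(-half, half + 1))
    return min(range(max_disp + 1), key=sad)

# Synthetic check (hypothetical data): random texture, with the right image
# showing the same texture displaced 3 pixels to the left.
rng = random.Random(0)
base = [[rng.random() for _ in range(24)] for _ in range(24)]
shifted = [[row[(x + 3) % 24] for x in range(24)] for row in base]
kernel = log_kernel(1.0, 2)
disparity = best_disparity(apply_filter(base, kernel), apply_filter(shifted, kernel), 14, 12, 3, 6)
```

Because the kernel sums to zero, uniformly lit, untextured regions filter to zero, which is one way to see why a space with too little texture has nothing to correlate against.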

Basic footage of the proximity space tracking me in a cluttered, indoor environment. (6.2 M)

More footage of proximity space tracking. Note how I am tracked in 3 dimensions. (6.5 M)

Proximity spaces "stick" to whatever is inside them, and since they have no model of what they are tracking, I can "steal" a space from someone else. (4.0 M)

However, the spaces do not cling to an object that merely occludes their current target. Since the person walking in front of me is at a sufficiently different depth, the space remains on my face. (880 K)

This clip demonstrates the search behavior of the proximity spaces. When a proximity space becomes "unoccupied", i.e. not enough correlatable texture is found in the image regions being searched, the space begins to move randomly about in a sphere centered on the target's last known location. At each point, the space tries to find enough texture to begin tracking again. This behavior is quite useful when the space momentarily loses its target. Notice how the box turns green (lost) when I move out of it and red (found) again when it finds me. (1.8 M)

The search behavior is shown again, but this time I lose the proximity space by moving away from the cameras too fast. After being reacquired, I move away from the cameras more slowly and the space tracks me. Note Eric Huber in the background trying to distract the system. (4.7 M)

Proximity spaces don't just track faces. In this clip they track my hands. (6.6 M)

Proximity spaces work over a range of lighting conditions. In this clip, I step into and out of the shadows while the proximity space tracks me. (10 M)
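
The search behavior shown in the clips above (random moves within a sphere centered on the last known target location, testing for texture at each point) might be sketched as follows. The function names, the rejection-sampling approach, the try limit, and the `has_texture` callback are hypothetical stand-ins, not Spot's actual interfaces.

```python
import random

def random_point_in_sphere(center, radius, rng):
    """Uniform random point inside a sphere, by rejection sampling from its bounding cube."""
    while True:
        dx, dy, dz = (rng.uniform(-radius, radius) for _ in range(3))
        if dx * dx + dy * dy + dz * dz <= radius * radius:
            return (center[0] + dx, center[1] + dy, center[2] + dz)

def search_for_target(last_known, radius, has_texture, rng, max_tries=200):
    """Try random positions around the last known location until one has enough
    texture to track (box turns red); return None if the search gives up."""
    for _ in range(max_tries):
        candidate = random_point_in_sphere(last_known, radius, rng)
        if has_texture(candidate):
            return candidate
    return None
```

Here `has_texture` stands for whatever test decides that a candidate position contains enough correlatable texture; the search itself needs no model of the target.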


Spot Movies


In this clip, Spot the robot has been equipped with a proximity space system running offboard on a PC. Here he tracks Eric Huber around the room, trying to remain approximately 6 feet from him. (11 M)

More of Spot following Eric. (11.5 M)
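
One simple way to hold a fixed following distance, as in the two clips above, is a clamped proportional rule on the range reported by the proximity space. This is only an illustrative sketch: the gain, the speed limit, and the function name are assumptions, and the page does not say how Spot's following behavior is actually implemented.

```python
FOLLOW_DISTANCE_M = 6 * 0.3048  # approximately 6 feet, in meters

def follow_speed(target_range_m, gain=0.5, max_speed=0.3):
    """Forward speed (m/s): positive to approach, negative to back away.
    The gain and max_speed values are hypothetical."""
    error = target_range_m - FOLLOW_DISTANCE_M
    return max(-max_speed, min(max_speed, gain * error))
```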


Movies of Spot's Perceptual Memory System


This shows Spot's view of the world and demonstrates some of the capabilities of his perceptual memory system. Proximity spaces are associated with humans for Spot's robo-waiter application, and this clip shows the association being initialized.
Humans are detected by a simple skin-tone model using the single color camera in the middle of Spot's pan/tilt unit. This technique can determine only two of the three dimensions a proximity space needs to track effectively, so it produces a direction (azimuth and elevation) along which the human should lie. Spot then moves a proximity space along this direction, searching for correlatable texture. When the box turns red, the proximity space has begun tracking as normal. Note that the reason I appear to grow larger in the image is that Spot approaches me after instantiating the proximity space. (7.3 M)
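
The step from a two-angle detection to a 3-D position can be sketched as a search along a ray. The axis convention (x forward, y left, z up), the range limits, the step size, and the `has_texture` callback are all assumptions for illustration; the page does not give Spot's actual conventions or parameters.

```python
import math

def ray_direction(azimuth, elevation):
    """Unit vector for the given angles in radians (assumed frame: x forward, y left, z up)."""
    return (math.cos(elevation) * math.cos(azimuth),
            math.cos(elevation) * math.sin(azimuth),
            math.sin(elevation))

def search_along_ray(azimuth, elevation, has_texture,
                     min_range=0.5, max_range=5.0, step=0.1):
    """Move a candidate position outward along the ray and return the first
    position with enough texture to track, or None if none is found."""
    d = ray_direction(azimuth, elevation)
    steps = int(round((max_range - min_range) / step))
    for i in range(steps + 1):
        r = min_range + i * step
        point = (r * d[0], r * d[1], r * d[2])
        if has_texture(point):
            return point
    return None
```

The color camera supplies the two angles; the stereo system supplies the missing third dimension by reporting where along the ray texture can be correlated.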

Spot's perceptual memory system uses markers to remember the positions of targets tracked by proximity spaces, even if the targets are not currently in view. Here, Spot redirects his attention away from me and instantiates another marker on another human (who initially has trouble getting Spot's attention). (9.1 M)

An uncut version of both these movies is available here. (15.8 M)

Spot tracks one person for a while and then returns his attention to the original human. Note how the position stored in the marker and the search behavior allow a proximity space to begin tracking me again. (9.3 M)

Now I am tracked for a while and then attention is redirected to the other marker again. Note that the other marker retains the last known position of the human, which is different from his original starting position (where the marker was first instantiated). (9.7 M)
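
The marker behavior seen in these clips, where each marker keeps the last known position of its target so that a proximity space can be sent back there later, can be illustrated with a very small data structure. The class and method names are hypothetical and are not taken from Spot's software.

```python
class PerceptualMemory:
    """Keeps one marker per target; each marker stores the last known 3-D position."""

    def __init__(self):
        self._markers = {}

    def instantiate(self, name, position):
        """Create a marker when a target is first acquired."""
        self._markers[name] = tuple(position)

    def update(self, name, position):
        """Overwrite the marker while its target is being tracked."""
        if name not in self._markers:
            raise KeyError(name)
        self._markers[name] = tuple(position)

    def recall(self, name):
        """Last known position, used as the starting point when attention returns."""
        return self._markers[name]
```

When attention returns to a marker, the recalled position would serve as the center of the search described earlier, so the target can be reacquired even if it has moved a little since it was last seen.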