Affordable 3D vision is about to enter the mass market in consumer products such as video game consoles and TV sets. In this context, depth information aids segmentation and provides robustness against illumination effects, both of which are hard problems when working with color camera data in typical living room situations. Several techniques compute 3D (or rather 2.5D) depth information from camera data, such as real-time stereo, time-of-flight (TOF), or real-time structured light, but all produce noisy depth data at fairly low resolutions. Not surprisingly, most applications are currently limited to basic gesture recognition using the full body. TOF cameras in particular are a relatively new and promising technology for compact, simple, and fast 2.5D depth measurements. Because they work by measuring the flight time of infrared light as it bounces off the subject, these devices have comparatively low image resolution (176 × 144 to 320 × 240 pixels) and a high level of noise in the raw data.
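The flight-time measurement principle can be made concrete with a small sketch (not from the original text; the function name and the example round-trip time are illustrative): the sensor observes the round-trip travel time of an infrared pulse, so the depth is half the distance light covers in that time.

```python
C = 299_792_458.0  # speed of light in m/s

def tof_depth(round_trip_time_s: float) -> float:
    """Depth in meters from a measured round-trip flight time.

    The light travels to the subject and back, so the
    one-way distance is half of (c * flight time).
    """
    return C * round_trip_time_s / 2.0

# A round trip of 10 ns corresponds to roughly 1.5 m of depth,
# illustrating why TOF sensing demands sub-nanosecond timing
# precision and why the raw measurements are noisy.
print(tof_depth(10e-9))
```

The tiny time scales involved (centimeter accuracy requires resolving fractions of a nanosecond) are one intuition for the high noise level in raw TOF data mentioned above.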