For recording faces and human activities in general, we have two facilities. The VideoLab was designed for recording human activities from several viewpoints: 6 cameras record precisely synchronized color videos. It has also been used to create a database of facial expressions and action units, and is being employed in projects on the recognition of facial expressions as well as in investigations of visual-auditory interactions in communication.
Several commercial 3D scanning systems are used in the ScanLab for capturing shape and color information of faces.
The data is used for producing stimuli for psychophysical experiments and for building models to support machine learning and computer vision algorithms.
 

Face and Motion Capture

VideoLab

The VideoLab, which has been in use since May 2002, was designed for recordings of human activities from several viewpoints. It currently consists of 6 Basler CCD cameras that can record precisely synchronized color videos. In addition, the system is equipped for multi-angle audio recording that is synchronized to the video data. The whole system was built from off-the-shelf components and was designed for maximum flexibility and extendibility. The VideoLab has been used to create a database of facial expressions and action units and is being employed in several projects on recognition of facial expressions and gestures across viewpoints, multi-modal action learning and recognition, as well as in investigations of visual-auditory interactions in communication.

One of the major challenges in creating the VideoLab was choosing hardware components that allow on-line, synchronized recording of uncompressed video streams. Each of the 6 Basler A302 bc digital video cameras produces up to 26 MB per second (782 × 582 pixels at 60 frames per second), data that is continuously written to hard disk. Currently, the computers have a striped RAID-0 configuration with a total disk capacity of 200 GB each to maximize writing speed. The computers are connected and controlled via a standard Ethernet LAN.
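
The per-camera data rate quoted above can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: it assumes 1 byte per pixel (8-bit raw sensor data), which is not stated in the text but matches the quoted figure when "MB" is read as 2^20 bytes. The number of cameras per recording computer is not given either, so `cameras_per_computer` is a hypothetical parameter.

```python
WIDTH, HEIGHT = 782, 582   # pixels per frame (from the text)
FPS = 60                   # frames per second (from the text)
BYTES_PER_PIXEL = 1        # assumption: 8-bit raw sensor data


def camera_rate_bytes_per_s(width=WIDTH, height=HEIGHT, fps=FPS,
                            bytes_per_pixel=BYTES_PER_PIXEL):
    """Uncompressed data rate of one camera in bytes per second."""
    return width * height * bytes_per_pixel * fps


def seconds_until_full(disk_bytes, cameras_per_computer=1):
    """Recording time until a disk array of the given size is full.

    cameras_per_computer is a hypothetical parameter; the text does not
    say how the 6 cameras are distributed over the computers.
    """
    return disk_bytes / (camera_rate_bytes_per_s() * cameras_per_computer)


rate = camera_rate_bytes_per_s()
print(f"{rate / 2**20:.1f} MiB/s per camera")
print(f"{seconds_until_full(200e9) / 60:.0f} min to fill 200 GB with one camera")
```

Under these assumptions a single camera fills a 200 GB array in roughly two hours of continuous recording, which shows why sustained write speed was a central constraint.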

As there was no commercial software available for controlling multi-camera recording setups, we had to develop our own distributed recording software. In addition to the frame grabber drivers, which provide basic recording functionality on a customized Linux operating system, we programmed a collection of distributed, multi-threaded real-time C programs that handle control of the hardware as well as buffering and write-out of the video and audio data to hard disk. All software components communicate with each other via a standard Ethernet LAN. On top of this control software, we have implemented a graphical user interface that gives access to the full functionality of the VideoLab.
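
The buffering and write-out step can be pictured as a producer/consumer pair sharing a bounded buffer. The following is a minimal sketch of that general pattern in Python, not the VideoLab software itself (which is written in C and talks to real frame grabbers). All names here (`grab_frames`, `write_frames`, `record`) are hypothetical, and the "frames" are dummy byte strings.

```python
import io
import queue
import threading


def grab_frames(buffer, n_frames, frame_size):
    """Stand-in for a frame grabber: pushes dummy frames into the buffer."""
    for i in range(n_frames):
        frame = bytes([i % 256]) * frame_size
        buffer.put(frame)      # blocks while the buffer is full
    buffer.put(None)           # sentinel: no more frames


def write_frames(buffer, sink):
    """Drains the buffer and writes every frame to the sink in order."""
    while True:
        frame = buffer.get()
        if frame is None:
            break
        sink.write(frame)


def record(n_frames, frame_size, sink, buffer_frames=8):
    """Runs grabbing and writing concurrently with a bounded buffer."""
    buffer = queue.Queue(maxsize=buffer_frames)
    writer = threading.Thread(target=write_frames, args=(buffer, sink))
    writer.start()
    grab_frames(buffer, n_frames, frame_size)
    writer.join()


sink = io.BytesIO()            # in-memory stand-in for a file on disk
record(n_frames=100, frame_size=1024, sink=sink)
print(len(sink.getvalue()))    # total bytes written
```

In this sketch the grabber simply waits when the buffer is full. A camera cannot wait, so a real-time recorder would instead have to size the buffer for the worst-case write latency or detect and report dropped frames.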

Example recording from the VideoLab: a facial expression recorded from 6 synchronized viewpoints

ScanLab

Left: ABW scanner setup. Right: Sample data of the ABW structured light scanner

Several commercial 3D scanning systems are used for capturing shape and color information of faces. The data is used for producing stimuli for psychophysical experiments and for building models to support machine learning and computer vision algorithms:

Cyberware Head Scanner
This scanner (Cyberware, Inc., USA) uses laser triangulation to record 3D data and a line sensor to capture color information. Both produce 512 × 512 data points (typical depth resolution 0.1 mm) and cover 360° of the head in a cylindrical projection within 20 s. It was used extensively to build the MPI Head Database (http://faces.kyb.tuebingen.mpg.de/).
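
To illustrate what a cylindrical projection means for the data, the sketch below converts one sample of a 512 × 512 cylindrical range map into a Cartesian 3D point. It assumes that each sample stores a radius (distance from the rotation axis), that columns index the angle and rows index the height, and that the vertical axis is y. The axis convention and the row spacing are illustrative assumptions, not properties of the scanner's file format.

```python
import math

N_ANGLES = 512    # samples around the head (from the text)
N_HEIGHTS = 512   # samples along the vertical axis (from the text)


def cylindrical_to_cartesian(radius_mm, col, row, row_spacing_mm):
    """Converts one range-map sample to an (x, y, z) point in millimeters.

    col selects the angle (0 .. N_ANGLES - 1 covers the full 360 degrees),
    row selects the height. row_spacing_mm is a hypothetical value for the
    vertical distance between neighboring rows.
    """
    theta = 2.0 * math.pi * col / N_ANGLES
    x = radius_mm * math.cos(theta)
    z = radius_mm * math.sin(theta)
    y = row * row_spacing_mm
    return x, y, z


# A sample a quarter of the way around the head, 100 mm from the axis.
print(cylindrical_to_cartesian(100.0, col=128, row=10, row_spacing_mm=0.6))
```

Because neighboring columns are a fixed angle apart, the horizontal spacing of the points on the surface grows with the radius, while the depth resolution stays that of the radius measurement.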

ABW Structured Light Scanner
This is a customized version of an industrial scanning system (ABW GmbH, Germany), modified for use as a dedicated face scanner. It consists of two LCD line projectors, three video cameras and three DSLR cameras. Using structured light and a calibrated camera/projector setup, 3D data can be calculated from the video images by triangulation. Covering a face from ear to ear, the system produces up to 900,000 3D points and 18 megapixels of color information. One recording takes about 2 s, which makes this scanner much more suitable for recording facial expressions than the Cyberware scanner with its 20 s scan time. It was used extensively for building a facial expression model and for collecting static FACS data.

ABW Dynamic Scanner
Using the same principle of structured light, this system uses a high-speed stripe pattern projector, two high-speed video cameras and a color camera synchronized to strobe illumination. It can currently perform 40 3D measurements/s (with color information), producing detailed face scans over time. Ten seconds of recording time produce 2 GB of raw data.
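
The figures quoted for the dynamic scanner imply a sustained raw data rate and an average amount of raw data per 3D measurement. The short calculation below works these out, assuming that "2 GB" means 2 × 10^9 bytes and that the scanner runs at the full 40 measurements/s for the whole recording.

```python
RAW_BYTES = 2e9            # raw data per recording (from the text, decimal GB assumed)
DURATION_S = 10            # recording time in seconds (from the text)
MEASUREMENTS_PER_S = 40    # 3D measurements per second (from the text)


def raw_data_rate(total_bytes, duration_s):
    """Average raw data rate in bytes per second."""
    return total_bytes / duration_s


def bytes_per_measurement(total_bytes, duration_s, measurements_per_s):
    """Average raw data volume behind one 3D measurement."""
    return total_bytes / (duration_s * measurements_per_s)


print(f"{raw_data_rate(RAW_BYTES, DURATION_S) / 1e6:.0f} MB/s")
print(f"{bytes_per_measurement(RAW_BYTES, DURATION_S, MEASUREMENTS_PER_S) / 1e6:.0f} MB per measurement")
```

The per-measurement value is an average over all raw camera data. The text does not say how many camera frames contribute to a single 3D measurement.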

3dMD Speckle Pattern Scanner
This turn-key system (3dMD Ltd, UK) is mainly designed for medical purposes. Using four video cameras in combination with infrared speckle pattern flashes and two color cameras in sync with photography flashes, it can capture a face from ear to ear in 2 ms, making it highly suitable for recording infants and children. This system is used in collaboration with the University Clinic for Dentistry and Oral Medicine for studying infant growth.

Passive 4D Stereo Scanner
With three synchronized HD machine vision cameras (two grayscale, one color), this system (Dimensional Imaging, UK) is capable of reconstructing high-quality dense stereo data of moving faces at a rate of 25 frames/s. Since it is a passive system, high-quality studio lighting can be used to illuminate the subject, which results in better color data and greater subject comfort than with the ABW Dynamic Scanner. While care must be taken with the focus, exposure and calibration of the cameras, the system imposes fewer limitations on rigid head motion.

 
Last updated: Tuesday, 23.09.2014